[jira] [Commented] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming

2016-04-25 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256291#comment-15256291
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11345:


The error occurred during a sequential repair.
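The quoted report shows the same stream ID in the logs of both nodes, so one way to read it is as a single timeline. The sketch below groups log entries by stream ID and sorts them by timestamp. The `by_stream` helper, the `ENTRY_RE` pattern, and the input shape (node address mapped to unwrapped log lines) are assumptions made for illustration, and the ordering is only meaningful if the node clocks are reasonably in sync.

```python
import re
from collections import defaultdict

# Assumed log layout: LEVEL [thread] timestamp Source.java:line - [Stream #id] message
ENTRY_RE = re.compile(
    r"(?P<level>INFO|WARN|ERROR)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\S+\s+-\s+"
    r"\[Stream #(?P<stream>[0-9a-f-]{36})\]\s+(?P<msg>.*)"
)


def by_stream(node_logs):
    """Group entries from several nodes by stream ID, sorted by timestamp.

    node_logs maps a node address to a list of unwrapped log lines
    (hypothetical input shape chosen for this sketch).
    """
    timeline = defaultdict(list)
    for node, lines in node_logs.items():
        for line in lines:
            m = ENTRY_RE.match(line)
            if m:
                timeline[m.group("stream")].append(
                    (m.group("ts"), node, m.group("level"), m.group("msg"))
                )
    for events in timeline.values():
        events.sort()  # timestamps in this format sort lexicographically
    return dict(timeline)


logs = {
    "172.16.63.41": [
        "WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed",
        "ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred",
    ],
    "10.174.216.160": [
        "ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred",
    ],
}

for stream_id, events in by_stream(logs).items():
    print(stream_id)
    for ts, node, level, msg in events:
        print(f"  {ts} {node} {level} {msg}")
```

Read this way, the "Connection reset by peer" on /10.174.216.160 is logged roughly 230 ms after the AssertionError on /172.16.63.41.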

> Assertion Errors "Memory was freed" during streaming
> 
>
> Key: CASSANDRA-11345
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11345
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Jean-Francois Gosselin
> Assignee: Paulo Motta
>
> We encountered the following AssertionError (twice on the same node) during a 
> repair:
> On node /172.16.63.41
> {noformat}
> INFO  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete
> WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed
> ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
> java.lang.AssertionError: Memory was freed
>     at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> {noformat} 
> On node /10.174.216.160
>  
> {noformat}   
> ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]
>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]
>     at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]

[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2016-03-23 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208597#comment-15208597
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10769:


Based on the comments in CASSANDRA-9935, the "AssertionError: row DecoratedKey" 
error is still present in 2.1.13.
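For readers unfamiliar with the message, the assertion in the quoted trace fires when validation receives a row whose key sorts before the previous one. A minimal sketch of that invariant follows, under two stated assumptions: keys are compared by token only (the real DecoratedKey comparison also takes the key bytes into account), and `OrderedValidator` is a hypothetical stand-in, not Cassandra's Validator class. The two tokens are taken from the quoted trace.

```python
class OrderedValidator:
    """Hypothetical stand-in for a validator that requires ascending tokens."""

    def __init__(self):
        self.last_token = None

    def add(self, token):
        # Each row must sort strictly after the previously added row.
        if self.last_token is not None and token <= self.last_token:
            raise AssertionError(
                f"row DecoratedKey({token}) received out of order "
                f"wrt DecoratedKey({self.last_token})"
            )
        self.last_token = token


validator = OrderedValidator()
validator.add(-5865937851627253360)
try:
    # This token is smaller than the previous one, so the check trips.
    validator.add(-5867787467868737053)
except AssertionError as err:
    print(err)
```

In other words, the row with token -5867787467868737053 arrived after a row with a larger token, which suggests the data being read was not in sorted order.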

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
> Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in a single data center, I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did was run repair on another node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> INFO  [AntiEntropySessions:415] 2015-11-25 06:28:21,533 
> RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new 
> session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range 
> (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, 
> user_stats, user_device, user_quota, user_store, user_device_progress, 
> entity_by_id2]
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[AntiEntropySessions:414,5,RMI Runtime]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
> [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_80]

[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-20 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199874#comment-15199874
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


Is this the same issue as CASSANDRA-9117, but not fixed in 2.1.x?

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jean-Francois Gosselin
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Francois Gosselin updated CASSANDRA-11374:
---
Comment: was deleted

(was: We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a 
subrange repair (we are not using incremental repair).)

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jean-Francois Gosselin
> Assignee: Marcus Eriksson
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}





[jira] [Created] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)
Jean-Francois Gosselin created CASSANDRA-11374:
--

Summary: LEAK DETECTED during repair
Key: CASSANDRA-11374
URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
Project: Cassandra
Issue Type: Bug
Reporter: Jean-Francois Gosselin


When running a range repair we are seeing the following LEAK DETECTED errors:

{noformat}
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
 was not released before the reference was garbage collected
{noformat}
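To see at a glance which resource types are being reported, the entries above can be tallied per leaked class. This is a rough sketch: the `LEAK_RE` pattern and the `summarize_leaks` helper are assumptions based only on the four lines shown here, and the pattern uses `\s+` so that it tolerates the line wrapping in this message.

```python
import re
from collections import Counter

# One LEAK DETECTED report; the class name is everything before "@<hash>:".
LEAK_RE = re.compile(
    r"LEAK\s+DETECTED:\s+a\s+reference\s+\(\S+\)\s+to\s+class\s+"
    r"(?P<cls>[\w.$]+)@\d+:"
)


def summarize_leaks(log_text):
    """Count leak reports per leaked class name."""
    return Counter(m.group("cls") for m in LEAK_RE.finditer(log_text))


sample = """
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK
DETECTED: a reference
(org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK
DETECTED: a reference
(org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
 was not released before the reference was garbage collected
"""

for cls, count in summarize_leaks(sample).items():
    print(count, cls)
```

Applied to the four entries above, this would count one `WrappedSharedCloseable$1` (the OffHeapBitSet) and three `SafeMemory$MemoryTidy` reports.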






[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201920#comment-15201920
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


We are using Spotify's Reaper (https://github.com/spotify/cassandra-reaper), 
so these are subrange repairs. We are not using incremental repair.
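For anyone who wants to reproduce this without Reaper, a subrange repair amounts to splitting a token range into smaller pieces and repairing each piece with nodetool's `-st`/`-et` (start/end token) options. The sketch below only builds the command strings. The helper names, the example range, and the `sync` keyspace (both taken from logs quoted in CASSANDRA-10769) are illustrative, and this is not how Reaper itself computes its segments.

```python
def split_token_range(start, end, parts):
    """Split the range (start, end] into `parts` contiguous subranges.

    Assumes start < end (no wrap-around) and parts <= end - start.
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if start >= end:
        raise ValueError("wrap-around ranges are not handled in this sketch")
    width = end - start
    # parts + 1 boundaries, the last one being exactly `end`.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, start, end, parts):
    """Build one nodetool command line per subrange (strings only)."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_token_range(start, end, parts)
    ]


for cmd in repair_commands("sync", 7119703141488009983, 7129744584776466802, 4):
    print(cmd)
```

Each subrange ends where the next begins, so together they cover the original range exactly once.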






[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair

2016-03-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201915#comment-15201915
 ] 

Jean-Francois Gosselin commented on CASSANDRA-11374:


We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a 
subrange repair (we are not using incremental repair).






[jira] [Created] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming

2016-03-11 Thread Jean-Francois Gosselin (JIRA)
Jean-Francois Gosselin created CASSANDRA-11345:
--

Summary: Assertion Errors "Memory was freed" during streaming
Key: CASSANDRA-11345
URL: https://issues.apache.org/jira/browse/CASSANDRA-11345
Project: Cassandra
Issue Type: Bug
Components: Streaming and Messaging
Reporter: Jean-Francois Gosselin


We encountered the following AssertionError (twice on the same node) during a 
repair:

On node /172.16.63.41

{noformat}
INFO  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete
WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed
ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
java.lang.AssertionError: Memory was freed
    at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat} 

On node /10.174.216.160
 
{noformat}   
ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]
    at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]
    at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMes

[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2016-02-22 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157028#comment-15157028
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

I will give it a try.

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
> Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
> Reporter: Eric Evans
> Assignee: T Jake Luciani
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but 
> not on a 3-node (otherwise identical) staging cluster (maybe it takes a 
> certain level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-19 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154789#comment-15154789
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

We are doing range repair with https://github.com/spotify/cassandra-reaper. We 
don't use incremental repair. We also see the issue with: nodetool repair -pr
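Since nodetool only reports "check server logs", it can help to pull the failed sessions out of system.log directly. The sketch below assumes the "Repair session ... for range (...] finished/failed" wording seen in the logs quoted in this issue; the `SESSION_RE` pattern and the `failed_sessions` helper are illustrative names, not part of any Cassandra tool.

```python
import re

# Session ID, range bounds, and outcome of one "Repair session" log entry.
SESSION_RE = re.compile(
    r"Repair\s+session\s+(?P<id>[0-9a-f-]{36})\s+for\s+range\s+"
    r"\((?P<start>-?\d+),(?P<end>-?\d+)\]\s+(?P<status>finished|failed)"
)


def failed_sessions(log_text):
    """Return (session_id, start_token, end_token) for each failed session."""
    return [
        (m.group("id"), int(m.group("start")), int(m.group("end")))
        for m in SESSION_RE.finditer(log_text)
        if m.group("status") == "failed"
    ]


sample = (
    "INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - "
    "Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range "
    "(-781874602493000830,-781745173070807746] finished\n"
    "ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - "
    "Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range "
    "(5765414319217852786,5781018794516851576] failed with error\n"
)

for session_id, start, end in failed_sessions(sample):
    print(session_id, start, end)
```

The ranges returned this way can then be retried individually as subrange repairs.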

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
> Reporter: mlowicki
> Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading 
> to 2.1.8 it runs faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> But a bit above I see (at least twice in the attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concur

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-12 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145336#comment-15145336
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] What's the next step to troubleshoot this issue? Is there a specific 
logger we could enable at DEBUG?

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
> Reporter: mlowicki
> Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading 
> to 2.1.8 it runs faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
>
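The report above notes that the last log records all say "finished" while failures appear further up. A minimal Python sketch for tallying both outcomes follows. The regular expression is an assumption based only on the StorageService lines quoted in this thread, and the function name `summarize_repair_sessions` is hypothetical. It also strips the mail-quote prefix and re-joins lines wrapped by the archive.

```python
import re
from collections import Counter

# Assumed line shape, taken from the StorageService entries quoted above:
#   "Repair session <uuid> for range (<start>,<end>] finished"
#   "Repair session <uuid> for range (<start>,<end>] failed with error ..."
SESSION_RE = re.compile(
    r"Repair session (?P<id>[0-9a-f-]{36}) for range "
    r"\((?P<start>-?\d+),(?P<end>-?\d+)\] (?P<outcome>finished|failed)"
)


def summarize_repair_sessions(log_text):
    """Return (Counter of outcomes, list of failed (session_id, start, end))."""
    # Drop any leading "> " quoting, then collapse wrapped lines into one string.
    flat = " ".join(
        part
        for line in log_text.splitlines()
        for part in line.lstrip("> ").split()
    )
    outcomes = Counter()
    failed = []
    for match in SESSION_RE.finditer(flat):
        outcomes[match.group("outcome")] += 1
        if match.group("outcome") == "failed":
            failed.append(
                (match.group("id"), int(match.group("start")), int(match.group("end")))
            )
    return outcomes, failed


# Two entries copied from the quoted log, still wrapped and quoted.
sample = """
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 -
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range
> (-781874602493000830,-781745173070807746] finished
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 -
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range
> (5765414319217852786,5781018794516851576] failed with error
"""

outcomes, failed = summarize_repair_sessions(sample)
```

Running this over a full system.log (if the real lines match the assumed shape) would list the failed sessions without scrolling past the trailing "finished" entries.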

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143400#comment-15143400
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

OK, from 172.16.63.39 we get the same error, "received out of order wrt DecoratedKey":

{noformat}
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,512 Validator.java:245 - 
Failed creating a merkle tree for [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b 
on foo/bar, (-5525881226490706160,-5525442713957813067]], /10.174.216.158 (see 
log for details)
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,516 CassandraDaemon.java:223 
- Exception in thread Thread[ValidationExecutor:118,1,main]
java.lang.AssertionError: row DecoratedKey(-5525725068665570338, 
0010e3a74bf82717394598e2b7421c89382e250265336137346266382d323731372d333934352d393865322d62373432316338393338326510f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
 received out of order wrt DecoratedKey(-5525444669477674618, 
0010581499f0b99337e1bf468611fd0233e4250235383134393966302d623939332d333765312d626634362d3836313166643032653410f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}
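To compare the two tokens in an error like the one above without reading them by eye, a small Python sketch can pull them out of the message. The pattern is an assumption derived only from the text of this AssertionError, and `parse_out_of_order` is a hypothetical helper name. The hex key bytes in the sample are shortened for readability.

```python
import re

# Assumed message shape, copied from the AssertionError above:
#   row DecoratedKey(<token>, <hex key>) received out of order wrt
#   DecoratedKey(<token>, <hex key>)
OUT_OF_ORDER_RE = re.compile(
    r"row DecoratedKey\((?P<row>-?\d+),\s*[0-9a-f]+\)\s*"
    r"received out of order wrt\s*DecoratedKey\((?P<last>-?\d+),\s*[0-9a-f]+\)"
)


def parse_out_of_order(message):
    """Return (row_token, compared_token) from the assertion text, or None."""
    match = OUT_OF_ORDER_RE.search(message)
    if match is None:
        return None
    return int(match.group("row")), int(match.group("last"))


# Tokens are the real ones from the log; the hex keys are shortened here.
message = (
    "java.lang.AssertionError: row DecoratedKey(-5525725068665570338, \n"
    "0010e3a7)\n"
    " received out of order wrt DecoratedKey(-5525444669477674618, \n"
    "00105814)"
)

row_token, compared_token = parse_out_of_order(message)
# In this excerpt the offending row's token is the smaller of the two.
row_is_smaller = row_token < compared_token
```

The `\s*` parts tolerate the line wrapping introduced by the mail archive, so the same pattern should also work on an unwrapped log line.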

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-86240400663737

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143307#comment-15143307
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

Here's a new one, with no clear message in the exception:

{noformat}
INFO  [AntiEntropyStage:1] 2016-02-11 17:21:20,947 RepairSession.java:171 - 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] Received merkle tree for bar 
from /10.53.10.30
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,033 RepairSession.java:303 - 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,034 CassandraDaemon.java:223 
- Exception in thread Thread[AntiEntropySessions:28,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
~[na:1.7.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.9.jar:2.1.9]
... 3 common frames omitted
ERROR [Thread-20728] 2016-02-11 17:21:21,034 StorageService.java:2966 - Repair 
session d78e02b0-d0e3-11e5-a04a-4ffa10ef584b for range 
(-5525881226490706160,-5525442713957813067] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, 
(-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.7.0_65]
at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
[na:1.7.0_65]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2957)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.9.jar:2.1.9]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
[na:1.7.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143270#comment-15143270
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] Yesterday we ran nodetool scrub on all the nodes and restarted them. 
No luck: we're still getting "received out of order wrt DecoratedKey". 
Any suggestions for the next step?


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141341#comment-15141341
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

No, we haven't seen this WARN. The only thing we haven't tried is a node 
restart (based on your comment above: "... The latter may be fixed by restarting 
the node."). I'm not sure it will fix the problem, though, since we've used 
C* 2.1.9 from the beginning.



[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141245#comment-15141245
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] Should the WARN message appear in the C* log or on the stdout of 
nodetool?


[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub

2016-02-10 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140965#comment-15140965
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10769:


We are also seeing this issue in our multi-datacenter cluster (3 DCs) on C* 
2.1.9 (using LCS). We ran nodetool scrub on all the nodes, but the error 
keeps coming back.

How can we get into this state?

{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - 
Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de 
on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 
(see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 
CassandraDaemon.java:223 - Exception in thread 
Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 
00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400)
 received out of order wrt DecoratedKey(5128167525973821686, 
00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}
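The wording of the assertion suggests that rows are expected to arrive in increasing token order during validation. The Python sketch below is a simplified model of that invariant, not Cassandra's Validator code. The function name `first_out_of_order` and the strict "greater than" comparison are assumptions made for illustration.

```python
def first_out_of_order(tokens):
    """Return (index, token, previous) for the first token that is not
    strictly greater than its predecessor, or None if the order holds."""
    previous = None
    for index, token in enumerate(tokens):
        if previous is not None and token <= previous:
            return index, token, previous
        previous = token
    return None


# The two tokens from the error above, in the order the message implies:
# the "wrt" key was seen first, then the offending row arrived.
seen = [5128167525973821686, 5126475305931285312]

violation = first_out_of_order(seen)
```

Here `violation` reports the second token, because it is smaller than the one seen before it, which matches the pair printed in the log.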

> "received out of order wrt DecoratedKey" after scrub
> 
>
> Key: CASSANDRA-10769
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.11, Debian Wheezy
>Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm 
> getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - 
> Failed creating a merkle tree for [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for 
> details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 
> CassandraDaemon.java:227 - Exception in thread 
> Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 
> 000932373633313036313204808800) received out of order wrt 
> DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
> at org.apache.cassandra.repair.Validator.add(Validator.java:127) 
> ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 
> RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] 
> session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair 
> #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, 
> (-5867793819051725444,-5865919628027816979]] Validation failed in 
> /10.210.3.117
> at 
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
>  ~[apac

[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2016-02-09 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139500#comment-15139500
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

It's not fixed. I ended up adding a catch for the AssertionError in the 
GraphiteReporter as a workaround.
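The general shape of such a "catch and continue" workaround can be sketched as follows. This is a Python analogy only, not the Java GraphiteReporter code, and the comment does not say where the AssertionError is raised. `run_cycles` and `flaky_report` are hypothetical names.

```python
import logging

logger = logging.getLogger("reporter")


def run_cycles(report, cycles):
    """Call report() `cycles` times and return how many calls succeeded.

    An AssertionError raised by one cycle is logged and swallowed so that
    later cycles still run instead of the whole loop stopping.
    """
    succeeded = 0
    for cycle in range(cycles):
        try:
            report()
            succeeded += 1
        except AssertionError:
            logger.exception("report cycle %d failed; continuing", cycle)
    return succeeded


# Hypothetical reporter that fails on its second call only.
calls = {"n": 0}


def flaky_report():
    calls["n"] += 1
    if calls["n"] == 2:
        raise AssertionError("simulated failure")
```

Calling `run_cycles(flaky_report, 3)` attempts all three cycles even though the second one raises. The trade-off is that the underlying error is only logged, not fixed.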

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-02-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134437#comment-15134437
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9935:
---

[~yukim] We are also seeing this issue in our multi-datacenter cluster (3 
DCs) running C* 2.1.9 with LCS. We ran nodetool scrub on all the nodes, but the 
error keeps coming back.

We did have some network glitches, as [~mlowicki] was saying. Could the error be 
related to network issues?

{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - 
Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de 
on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 
(see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 
CassandraDaemon.java:223 - Exception in thread 
Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 
00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400)
 received out of order wrt DecoratedKey(5128167525973821686, 
00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
at org.apache.cassandra.repair.Validator.add(Validator.java:126) 
~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615)
 ~[apache-cassandra-2.1.9.jar:2.1.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}


> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading 
> to 2.1.8 repair runs faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749

[jira] [Commented] (CASSANDRA-10502) Cassandra query degradation with high frequency updated tables

2016-01-06 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086025#comment-15086025
 ] 

Jean-Francois Gosselin commented on CASSANDRA-10502:


[~thobbs] Have you tried dumping the data for this key with sstable2json?

> Cassandra query degradation with high frequency updated tables
> --
>
> Key: CASSANDRA-10502
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10502
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dodong Juan
>Priority: Minor
>  Labels: perfomance, query, triage
> Fix For: 2.2.x
>
>
> Hi,
> We are developing a system that computes profiles of the things it 
> observes. Observations arrive as events. Each thing it observes has an id 
> and contains a set of subthings, each of which carries a measurement of 
> some kind. There are roughly 500 subthings within each thing. We receive 
> events containing measurements of these 500 subthings every 10 seconds or so.
> As we receive events, we read the old profile value, calculate the new 
> profile based on the new value, and save it back. 
> One of the things we observe is the set of processes running on the server.
> We use the following schema to hold the profile. 
> {noformat}
> CREATE TABLE processinfometric_profile (
> profilecontext text,
> id text,
> month text,
> day text,
> hour text,
> minute text,
> command text,
> cpu map,
> majorfaults map,
> minorfaults map,
> nice map,
> pagefaults map,
> pid map,
> ppid map,
> priority map,
> resident map,
> rss map,
> sharesize map,
> size map,
> starttime map,
> state map,
> threads map,
> user map,
> vsize map,
> PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
> ) WITH CLUSTERING ORDER BY (command ASC)
> AND bloom_filter_fp_chance = 0.1
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {noformat}
> This profile is then used for analytics, either in the context of a 
> 'thing' or in the context of a specific thing and subthing. 
> A profile can be monthly, daily, or hourly. For a monthly profile, the 
> month is set to the current month (e.g. 'Oct') and the day and hour are 
> set to the empty string ''.
> The problem we have observed is that over time (in just a matter of 
> hours) query response for the monthly profile degrades badly. At the start it 
> responds in 10-100 ms, and after a couple of hours it takes 2000-3000 ms. 
> If you leave it for a couple of days you will start experiencing read 
> timeouts. The query is basically just:
> {noformat}
> select * from myprofile where id='1' and month='Oct' and day='' and hour='' 
> and minute=''
> {noformat}
> This will have only about 500 rows or so.
> We were using Cassandra 2.2.1, but upgraded to 2.2.2 to see if it fixed the 
> issue, to no avail. Since this is a test, we are running on a single node.





[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting

2015-10-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943468#comment-14943468
 ] 

Jean-Francois Gosselin edited comment on CASSANDRA-9625 at 10/5/15 2:51 PM:


I ran into another assert that breaks the GraphiteReporter on 2.1.9. When 
SSTableReader.getApproximateKeyCount is called, how can it end up in a state 
where the CompactionMetadata is null?

{code:title=SSTableReader.java|borderStyle=solid}
276  try
278  {
279      CompactionMetadata metadata = (CompactionMetadata) sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, MetadataType.COMPACTION);
280      assert metadata != null : sstable.getFilename();
281      if (cardinality == null)
{code}

{noformat}
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27)
 
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) 
at 
com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) 
{noformat}









> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-10-05 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943468#comment-14943468
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

I ran into another assert that breaks the GraphiteReporter on 2.1.9. When 
SSTableReader.getApproximateKeyCount is called, how can it end up in a state 
where the CompactionMetadata is null?

{code:title=SSTableReader.java|borderStyle=solid}
276  try
278  {
279      CompactionMetadata metadata = (CompactionMetadata) sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, MetadataType.COMPACTION);
280      assert metadata != null : sstable.getFilename();
281      if (cardinality == null)
{code}

{noformat}
at 
org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
 
at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27)
 
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) 
at 
com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235)
 
at 
com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) 
{noformat}




> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-26 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715194#comment-14715194
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~benedict] Can this issue be reopened?

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-25 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711887#comment-14711887
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~tjake] I think I've found the issue. When the gauge for 
CompressionMetadataOffHeapMemoryUsed is read, the following method in 
org.apache.cassandra.io.util.Memory is called:

{code:title=org.apache.cassandra.io.util.Memory.java|borderStyle=solid}
public long size()
{
    assert peer != 0;
    return size;
}
{code}
and for some reason peer was 0. After the AssertionError, the Graphite metrics 
reporter thread no longer runs.
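
The silent stop is consistent with how the JDK handles periodic tasks: if one run of a task scheduled on a ScheduledExecutorService throws, all later runs are suppressed and nothing is logged unless someone inspects the returned future. The sketch below shows that behaviour in isolation. It assumes the reporter is driven by such an executor; the class name `SuppressedReporterDemo` and the simulated failure are illustrative, not Cassandra or Metrics code.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SuppressedReporterDemo {

    // Schedules a periodic task whose second run throws AssertionError and
    // returns how many times the task ran in total.
    static int countRuns() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            ScheduledFuture<?> future = executor.scheduleWithFixedDelay(() -> {
                if (runs.incrementAndGet() == 2) {
                    throw new AssertionError("simulated failing gauge");
                }
            }, 0, 10, TimeUnit.MILLISECONDS);
            try {
                // A periodic future never completes normally; get() returns
                // only by throwing once a run has failed.
                future.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException expected) {
                // The AssertionError surfaces here and nowhere else.
            }
            Thread.sleep(200); // leave time for further runs, if there were any
            return runs.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("runs = " + countRuns());
    }
}
```

The counter stays at 2 although the task was scheduled every 10 ms, which would also explain why the logs stay free of errors.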

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-13 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695853#comment-14695853
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

[~tjake] I can easily reproduce the issue after ~12h, but 
com.yammer.metrics.reporting at DEBUG didn't provide anything. Are there any 
specific places where I should add traces in GraphiteReporter?

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.





[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2015-08-11 Thread Jean-Francois Gosselin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682117#comment-14682117
 ] 

Jean-Francois Gosselin commented on CASSANDRA-9625:
---

We are seeing this issue on 2.1.8.

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Fix For: 2.1.x
>
> Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.


