[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2013-01-10 Thread Janne Jalkanen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549412#comment-13549412
 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:58 AM:


I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight 
upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard 
IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems 
random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded. Environment is Ubuntu Linux 
12.04 LTS, JVM is OpenJDK 7u9.
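
The rolling procedure above could be scripted roughly as follows. This is only a sketch: it builds the nodetool command lines named in the comment without running them, the function names and host names are hypothetical, and the stop, package upgrade, and restart steps are left out because they depend on how Cassandra is installed.

```python
def pre_upgrade_commands(host):
    """Commands to quiesce one node before it is stopped and upgraded."""
    steps = ("disablegossip", "disablethrift", "drain")
    return [["nodetool", "-h", host, step] for step in steps]


def post_upgrade_commands(host, primary_range_only=True):
    """Commands to run on one node once the whole cluster is upgraded."""
    repair = ["nodetool", "-h", host, "repair"]
    if primary_range_only:
        # -pr repairs only the node's primary range, as in the comment.
        repair.append("-pr")
    return [["nodetool", "-h", host, "upgradesstables"], repair]


if __name__ == "__main__":
    # Hypothetical host names for a three-node cluster.
    for host in ("node1", "node2", "node3"):
        for cmd in pre_upgrade_commands(host):
            print(" ".join(cmd))
        print("# stop, upgrade and restart %s here (site-specific)" % host)
    for host in ("node1", "node2", "node3"):
        for cmd in post_upgrade_commands(host):
            print(" ".join(cmd))
```

The commands are returned as argument lists so they can be reviewed, or passed to subprocess.run, one node at a time.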

The last repair picked up 497 invalid counter shards. We have approximately 8 
million counters, of which about a hundred are incremented each second (and 
sometimes subtracted from if our read repair kicks in - we have our own in-app 
repair for certain low values).  All the counter writes are batched with 100 
increments/batch.  So this is only affecting a really small subset, though it's 
rather annoying when it happens, as it means that you can never really trust 
the counters to be even in the ballpark :-/
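
To put numbers on how far the counts diverge, the log can be scanned for the error line quoted in the issue description. The sketch below assumes each message sits on a single log line in the format shown there; the regular expression and the function name are my own, not part of Cassandra.

```python
import re

# One "(id, clock, count)" triple as printed in the error message.
_SHARD = r"\(([^,]*), (\d+), (-?\d+)\)"
_PATTERN = re.compile(
    r"invalid counter shard detected; " + _SHARD + r" and " + _SHARD
)


def count_gaps(lines):
    """Return the absolute count difference for every invalid-shard message."""
    gaps = []
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            first_count = int(match.group(3))
            second_count = int(match.group(6))
            gaps.append(abs(first_count - second_count))
    return gaps


if __name__ == "__main__":
    sample = (
        "2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard "
        "detected; (17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 13) and "
        "(17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 1) differ only in count; "
        "will pick highest to self-heal"
    )
    gaps = count_gaps([sample])
    print(len(gaps), "invalid shard(s), largest gap", max(gaps))
```

Running it over a whole log file (for example `count_gaps(open(path))`, with `path` pointing at the node's log) gives both the number of affected shards and the size of each discrepancy.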

  was (Author: jalkanen):
I'm seeing this while running repair -pr. Three-node cluster, RF 3. 
Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid 
shard IDs, counts differ by more than one - sometimes even by 3000 or more.  
Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded. Environment is Ubuntu Linux 
12.04 LTS, JVM is OpenJDK 7u9.
  
 invalid counter shard detected 
 ---

 Key: CASSANDRA-4417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4417
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1
 Environment: Amazon Linux
Reporter: Senthilvel Rangaswamy
 Attachments: cassandra-mck.log.bz2, err.txt


 Seeing errors like these:
 2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard detected; 
 (17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 13) and 
 (17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 1) differ only in count; will pick 
 highest to self-heal; this indicates a bug or corruption generated a bad 
 counter shard
 What does it mean ?
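
 One way to read the message, as a sketch rather than Cassandra's actual code: if each triple is taken to be (counter id, clock, count), which is an assumption based on the wording of the error, then two shards with the same id and the same clock should always carry the same count. When they do not, the message says the higher count is kept. The class and function names below are hypothetical.

```python
from typing import NamedTuple, Tuple


class Shard(NamedTuple):
    counter_id: str  # assumed meaning of the first field in the log message
    clock: int       # assumed meaning of the second field
    count: int       # assumed meaning of the third field


def merge_shards(a: Shard, b: Shard) -> Tuple[Shard, bool]:
    """Merge two shards of the same counter id.

    Returns (winner, invalid). invalid is True when the two shards differ
    only in count, which is the situation the error message reports.
    """
    if a.counter_id != b.counter_id:
        raise ValueError("shards belong to different counter ids")
    if a.clock != b.clock:
        # Normal case: the shard with the newer clock supersedes the other.
        return (a if a.clock > b.clock else b), False
    if a.count == b.count:
        return a, False
    # Same id and clock but different counts: keep the highest, as logged.
    return (a if a.count > b.count else b), True


if __name__ == "__main__":
    left = Shard("17bfd850-ac52-11e1--6ecd0b5b61e7", 1, 13)
    right = Shard("17bfd850-ac52-11e1--6ecd0b5b61e7", 1, 1)
    print(merge_shards(left, right))
```

 Under this reading, picking the highest count is a guess at the true value, which is why the counter may end up wrong either way.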

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2013-01-09 Thread Janne Jalkanen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549412#comment-13549412
 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:48 AM:


I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight 
upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard 
IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems 
random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, upgrade, restart on every node in a 
rolling fashion.  Then I did upgradesstables and repair -pr on every node when 
the entire cluster had been upgraded.

  was (Author: jalkanen):
I'm seeing this while running repair -pr. Three-node cluster, RF 3. 
Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid 
shard IDs, counts differ by more than one - sometimes even by 3000 or more.  
Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.

I did disablegossip, disablethrift, drain, upgrade, restart on every node in a 
rolling fashion.  Then I did upgradesstables and repair -pr on every node when 
the entire cluster had been upgraded.
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2013-01-09 Thread Janne Jalkanen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549412#comment-13549412
 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:48 AM:


I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight 
upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard 
IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems 
random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded.

  was (Author: jalkanen):
I'm seeing this while running repair -pr. Three-node cluster, RF 3. 
Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid 
shard IDs, counts differ by more than one - sometimes even by 3000 or more.  
Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, upgrade, restart on every node in a 
rolling fashion.  Then I did upgradesstables and repair -pr on every node when 
the entire cluster had been upgraded.
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2013-01-09 Thread Janne Jalkanen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549412#comment-13549412
 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:49 AM:


I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight 
upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard 
IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems 
random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded. Environment is Ubuntu Linux 
12.04 LTS.

  was (Author: jalkanen):
I'm seeing this while running repair -pr. Three-node cluster, RF 3. 
Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid 
shard IDs, counts differ by more than one - sometimes even by 3000 or more.  
Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded.
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2013-01-09 Thread Janne Jalkanen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549412#comment-13549412
 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:50 AM:


I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight 
upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard 
IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems 
random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded. Environment is Ubuntu Linux 
12.04 LTS, JVM is OpenJDK 7u9.

  was (Author: jalkanen):
I'm seeing this while running repair -pr. Three-node cluster, RF 3. 
Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid 
shard IDs, counts differ by more than one - sometimes even by 3000 or more.  
Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* 
increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every 
node in a rolling fashion.  Then I did upgradesstables and repair -pr on every 
node when the entire cluster had been upgraded. Environment is Ubuntu Linux 
12.04 LTS.
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2012-11-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492245#comment-13492245
 ] 

Mck SembWever edited comment on CASSANDRA-4417 at 11/7/12 10:20 AM:


Sylvain, here's the log from one node. For most of the log we were running 1.0.8. 
And then at line 2883399 we upgraded (and this was the first node to upgrade) 
to 1.1.6.

The error msg comes every few seconds.
Our counters are sub-columns inside supercolumns.
We completed the upgrade on all nodes. Then restarted again (because jna was 
missing).

We are now running upgradesstables but that's not in this logfile. The error 
msgs still appear.

An operational problem we've had recently is that we had one node down for ~one 
month (faulty raid controller) and when we finally brought the node back into 
the cluster nightly repairs would never finish. In the end we just disabled 
nightly repairs (we don't have tombstones) with the plan that an upgrade and 
upgradesstables would bring us back to a state where repairs would work again. 
I have no idea if this can be related. 

  was (Author: michaelsembwever):
Sylvain, here's the log from one node. For most of the log we were running 
1.0.8. And then at line 2883399 we upgraded (and this was the first node to 
upgrade) to 1.1.6.

The error msg comes every few seconds.
Our counters are sub-columns inside supercolumns.
We completed the upgrade on all nodes. Then restarted again (because jna was 
missing).

We are now running upgradesstables but that's not in this logfile. The error 
msgs still appear.

An operational problem we're had recently is that we had one node down for ~one 
month (faulty raid controller) and when we finally brought the node back into 
the cluster nightly repairs would never finish. In the end we just disabled 
nightly repairs (we don't have tombstones) with the plan that an upgrade and 
upgradesstables would bring us back to a state where repairs would work again. 
I have no idea if this can be related. 
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2012-10-26 Thread Eric Lubow (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13485333#comment-13485333
 ] 

Eric Lubow edited comment on CASSANDRA-4417 at 10/27/12 2:22 AM:
-

We are getting this on DSE 2.2 (C* 1.1.5) on a new node during bootstrap.  We 
upgraded the cluster from C* 1.0.10 about 10 days ago and upgradesstables was 
run on every node and we repaired the entire cluster.  We've been 
getting this error sporadically on various nodes at various points but it's not 
consistent.  I've double and triple checked every node looking for sstable 
files named *- hd -* and I don't see any (assuming that's enough to tell that 
the sstable has been upgraded).  If this error is an effect of requiring one to 
run upgradesstables, then how would it happen during a bootstrap? All nodes 
involved in this cluster are 1.1.5.
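
The file-name check described above can be automated. The sketch below walks a data directory and lists files whose names contain the "-hd-" version marker mentioned in the comment; the function name, the directory path, and the example file names are hypothetical, and whether an empty result really proves every sstable was upgraded is the same assumption the comment makes.

```python
import os


def find_sstables_with_version(data_dir, version="hd"):
    """Return sorted paths of files under data_dir named like *-<version>-*."""
    marker = "-" + version + "-"
    matches = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if marker in name:
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # The data directory is an assumption; use the one from your configuration.
    leftovers = find_sstables_with_version("/var/lib/cassandra/data")
    for path in leftovers:
        print(path)
    print("%d file(s) still carry the old version marker" % len(leftovers))
```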

  was (Author: elubow):
We are getting this on DSE 2.2 (C* 1.1.5) on a new node during bootstrap.  
We upgraded the cluster from C* 1.0.10 about 10 days ago and upgradesstables 
was run on every node and we repaired the entire cluster.  We've been 
getting this error sporadically on various nodes at various points but it's not 
consistent.  I've double and triple checked every node looking for sstable 
files named *-hd-* and I don't see any (assuming that's enough to tell that 
the sstable has been upgraded).  If this error is an effect of requiring one to 
run upgradesstables, then how would it happen during a bootstrap? All nodes 
involved in this cluster are 1.1.5.
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2012-09-11 Thread Omid Aladini (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453537#comment-13453537
 ] 

Omid Aladini edited comment on CASSANDRA-4417 at 9/12/12 10:18 AM:
---

{quote}
A simple workaround is to use batch commit log, but that has a potentially 
important performance impact.
{quote}

I'm a bit confused why batch commit would solve the problem. If Cassandra 
crashes before the batch is fsynced, the counter mutations in the batch which 
it was the leader for will still be lost although they might have been applied 
on other replicas. The difference would be that the mutations won't be 
acknowledged to the client, and since counters aren't idempotent, the client 
won't know whether to retry or not. Am I missing something?
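
The retry dilemma can be shown with a toy model. This is not Cassandra's write path: the class and function names are hypothetical, and a single in-memory value stands in for a replica. It only illustrates that when an increment is applied but never acknowledged, a retrying client counts it twice.

```python
class ToyCounter:
    """In-memory stand-in for a counter replica (illustrative only)."""

    def __init__(self):
        self.value = 0

    def increment(self, delta, ack_lost=False):
        # The write is applied first; the acknowledgement may then be lost.
        self.value += delta
        if ack_lost:
            raise TimeoutError("increment applied but not acknowledged")


def increment_with_retry(counter, delta, ack_lost_first=False):
    """Client that retries once on timeout, as it safely could for an
    idempotent write."""
    try:
        counter.increment(delta, ack_lost=ack_lost_first)
    except TimeoutError:
        counter.increment(delta)


if __name__ == "__main__":
    counter = ToyCounter()
    increment_with_retry(counter, 1, ack_lost_first=True)
    # The client intended +1, but the retry was applied on top of the
    # unacknowledged first attempt.
    print(counter.value)
```

Not retrying has the opposite risk: if the first attempt was in fact lost, the increment never happens. The client cannot tell the two cases apart from a timeout alone.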

  was (Author: omid):
{quote}
A simple workaround is to use batch commit log, but that has a potentially 
important performance impact.
{quote}

I'm a bit confused why batch commit would solve the problem. If Cassandra 
crashes before the batch is fsynced, the counter mutations which it was the 
leader for will still be lost although they might have been applied on other 
replicas. The difference would be that the mutations won't be acknowledged to 
the client, and since counters aren't idempotent, the client won't know whether 
to retry or not. Am I missing something?
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2012-09-07 Thread Charles Brophy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13450766#comment-13450766
 ] 

Charles Brophy edited comment on CASSANDRA-4417 at 9/8/12 3:35 AM:
---

We have a six node cluster with even key range balance, random partitioner, and 
with replication factor=2. I get these errors immediately following running 
nodetool repair but ONLY if a streaming repair happens as a result. We are 
serving live updates to our counters from our clickstream. My guess is that the 
sstable being streamed between the servers winds up becoming out of date for 
the duration of the streaming process and ends up containing these duplicates 
that are vetted during the subsequent compaction. In any case, for us it is 
100% reproducible via: nodetool repair - streaming repair - subsequent 
compaction. Let me know if you need more details. Hope this helps!

  was (Author: charlesb_zulily):
We have a six node cluster with even key range balance, random partitioner, 
and with relication factor=2. I get these errors immediately following running 
nodetool repair but ONLY if a streaming repair happens as a result. We are 
serving live updates to our counters from our clickstream. My guess is that the 
sstable being streamed between the servers winds up becoming out of date for 
the duration of the streaming process and ends up containing these duplicates 
that are vetted during the subsequent compaction. In any case, for us it is 
100% reproducible via: nodetool repair - streaming repair - subsequent 
compaction. Let me know if you need more details. Hope this helps!
  


[jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected

2012-09-07 Thread Charles Brophy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13450766#comment-13450766
 ] 

Charles Brophy edited comment on CASSANDRA-4417 at 9/8/12 3:39 AM:
---

We have a six-node cluster [1.1.3, JDK 1.6.33, CentOS 6] with even key range 
balance, random partitioner, and with replication factor=2. I get these errors 
immediately following running nodetool repair but ONLY if a streaming repair 
happens as a result. We are serving live updates to our counters from our 
clickstream. My guess is that the sstable being streamed between the servers 
winds up becoming out of date for the duration of the streaming process and 
ends up containing these duplicates that are vetted during the subsequent 
compaction. In any case, for us it is 100% reproducible via: nodetool repair - 
streaming repair - subsequent compaction. Let me know if you need more 
details. Hope this helps!

  was (Author: charlesb_zulily):
We have a six node cluster with even key range balance, random partitioner, 
and with replication factor=2. I get these errors immediately following running 
nodetool repair but ONLY if a streaming repair happens as a result. We are 
serving live updates to our counters from our clickstream. My guess is that the 
sstable being streamed between the servers winds up becoming out of date for 
the duration of the streaming process and ends up containing these duplicates 
that are vetted during the subsequent compaction. In any case, for us it is 
100% reproducible via: nodetool repair - streaming repair - subsequent 
compaction. Let me know if you need more details. Hope this helps!
  