[ https://issues.apache.org/jira/browse/CASSANDRA-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115520#comment-14115520 ]
Brandon Williams edited comment on CASSANDRA-7239 at 8/29/14 5:23 PM:
----------------------------------------------------------------------

It appears the root cause here is a serious bookkeeping problem: we only increment totalDiskSpaceUsed in DataTracker.replaceFlushed, which is to say that we only count flushed memtables. This means that if you run a compaction, even when the reported load isn't negative, it is still completely wrong:

{noformat}
root@bw-2:/srv/cassandra# du -sh /var/lib/cassandra/data/Keyspace1/
163M    /var/lib/cassandra/data/Keyspace1/
root@bw-2:/srv/cassandra# bin/nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns    Host ID                               Rack
UN  10.208.35.225  33.16 MB  256     100.0%  ac9b6dd7-233e-4cd5-a7e0-bc9564871704  rack1
{noformat}

(33 MB because that's the size this machine likes to flush at, and there was one flush after the compaction.) It looks like we need to increment somewhere in compaction as well, probably when CompactionTask completes. This area isn't my forte, though.

> Nodetool Status Reports Negative Load With VNodes Disabled
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7239
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7239
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>        Environment: 1000 Nodes EC2 m1.large ubuntu 12.04
>            Reporter: Russell Alexander Spitzer
>            Priority: Critical
>             Fix For: 2.1.0
>
>         Attachments: nodetool.png, opscenter.png
>
>
> When I run stress on a large cluster without vnodes (num_token=1, initial token set), the loads reported by nodetool status are negative, or become negative after stress is run.
> {code}
> UN  10.97.155.31   -447426217 bytes  1  0.2%  8d40568c-044c-4753-be26-4ab62710beba  rack1
> UN  10.9.132.53    -447342449 bytes  1  0.2%  58e7f255-803d-493b-a19e-58137466fb78  rack1
> UN  10.37.151.202  -447298672 bytes  1  0.2%  ba29b1f1-186f-45d0-9e59-6a528db8df5d  rack1
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
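The bookkeeping failure described in the comment can be sketched as follows. This is a minimal illustrative model, not Cassandra's actual code: the class and method names (DiskSpaceTracker, onFlush, onCompactionFinished) and the byte sizes are hypothetical. It shows how a counter that is incremented only on memtable flush, but decremented when compaction inputs are removed, drifts low and eventually goes negative once a compaction removes an sstable whose size was never added.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the load bookkeeping; not Cassandra's real API.
class DiskSpaceTracker {
    private final AtomicLong totalDiskSpaceUsed = new AtomicLong();

    // Called when a memtable is flushed to a new sstable -- the only
    // place the buggy bookkeeping ever increments the counter.
    void onFlush(long newSSTableBytes) {
        totalDiskSpaceUsed.addAndGet(newSSTableBytes);
    }

    // Called when compaction replaces input sstables with one output
    // sstable. The proposed fix is to also add the output size here;
    // without it, subtracting an input that was itself a compaction
    // output (and so was never counted) drives the total negative.
    void onCompactionFinished(long[] inputBytes, long outputBytes, boolean applyFix) {
        if (applyFix)
            totalDiskSpaceUsed.addAndGet(outputBytes);
        for (long in : inputBytes)
            totalDiskSpaceUsed.addAndGet(-in);
    }

    long load() {
        return totalDiskSpaceUsed.get();
    }
}

public class LoadBookkeepingDemo {
    public static void main(String[] args) {
        // Buggy behaviour: flush two 33-byte-unit sstables, compact them
        // into one, then compact that output again. The second compaction
        // subtracts a size that was never added, so load goes negative.
        DiskSpaceTracker buggy = new DiskSpaceTracker();
        buggy.onFlush(33);
        buggy.onFlush(33);
        buggy.onCompactionFinished(new long[]{33, 33}, 60, false); // load = 0
        buggy.onCompactionFinished(new long[]{60}, 58, false);     // load = -60
        System.out.println("buggy load: " + buggy.load());

        // With the compaction-side increment, load tracks actual disk usage.
        DiskSpaceTracker fixed = new DiskSpaceTracker();
        fixed.onFlush(33);
        fixed.onFlush(33);
        fixed.onCompactionFinished(new long[]{33, 33}, 60, true);  // load = 60
        fixed.onCompactionFinished(new long[]{60}, 58, true);      // load = 58
        System.out.println("fixed load: " + fixed.load());
    }
}
```

Adding the output size when CompactionTask completes, as the comment suggests, is what the applyFix branch models: increments then happen at every point an sstable is created, so the later decrements always have a matching addition.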