[ https://issues.apache.org/jira/browse/CASSANDRA-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114218#comment-14114218 ]
Brandon Williams commented on CASSANDRA-7239: --------------------------------------------- Here's one way to repro fairly easily: stick an assert/throw in org.apache.cassandra.db.DataTracker.spaceReclaimed after it decrements, then do some stress inserts. Took around 4M for me to trigger it: {noformat} INFO 19:26:13 Enqueuing flush of Standard1: 87030528 (33%) on-heap, 0 (0%) off-heap INFO 19:26:13 Writing Memtable-Standard1@1502715954(24933920 serialized bytes, 566680 ops, 33%/0% of on/off-heap limit) INFO 19:26:15 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-14-Data.db (31847416 bytes) for commitlog position ReplayPosition(segmentId=1409253765886, position=18443872) INFO 19:26:15 Enqueuing flush of compactions_in_progress: 1334 (0%) on-heap, 0 (0%) off-heap INFO 19:26:15 Writing Memtable-compactions_in_progress@1683103686(144 serialized bytes, 9 ops, 0%/0% of on/off-heap limit) INFO 19:26:15 Completed flushing /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-6-Data.db (167 bytes) for commitlog position ReplayPosition(segmentId=1409253765887, position=857003) INFO 19:26:15 Compacting [SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-12-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-13-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-11-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-14-Data.db')] INFO 19:26:18 Enqueuing flush of Standard1: 87028992 (33%) on-heap, 0 (0%) off-heap INFO 19:26:18 Writing Memtable-Standard1@1478442071(24930840 serialized bytes, 566610 ops, 33%/0% of on/off-heap limit) INFO 19:26:20 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-16-Data.db (31843482 bytes) for commitlog position ReplayPosition(segmentId=1409253765887, position=21153003) INFO 19:26:23 Enqueuing flush of Standard1: 87039744 (33%) on-heap, 0 (0%) off-heap INFO 19:26:23 Writing Memtable-Standard1@708372436(24935900 serialized bytes, 566725 ops, 33%/0% of on/off-heap limit) INFO 19:26:25 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-17-Data.db (31849945 bytes) for commitlog position ReplayPosition(segmentId=1409253765888, position=23869152) INFO 19:26:28 Enqueuing flush of Standard1: 87031296 (33%) on-heap, 0 (0%) off-heap INFO 19:26:28 Writing Memtable-Standard1@1583340894(24930840 serialized bytes, 566610 ops, 33%/0% of on/off-heap limit) INFO 19:26:29 Enqueuing flush of compactions_in_progress: 180 (0%) on-heap, 0 (0%) off-heap INFO 19:26:29 Writing Memtable-compactions_in_progress@1333345401(0 serialized bytes, 1 ops, 0%/0% of on/off-heap limit) INFO 19:26:29 Completed flushing /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-7-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1409253765890, position=1491303) INFO 19:26:30 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-18-Data.db (31843482 bytes) for commitlog position ReplayPosition(segmentId=1409253765889, position=26577952) INFO 19:26:30 Compacted 4 sstables to [/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-15,]. 127,390,788 bytes to 127,390,788 (~100% of original) in 15,199ms = 7.993244MB/s. 453,348 total partitions merged to 453,348. Partition merge counts were {1:453348, } INFO 19:26:33 Enqueuing flush of Standard1: 87026688 (33%) on-heap, 0 (0%) off-heap INFO 19:26:33 Writing Memtable-Standard1@159807468(24931940 serialized bytes, 566635 ops, 33%/0% of on/off-heap limit) INFO 19:26:34 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-19-Data.db (31844887 bytes) for commitlog position ReplayPosition(segmentId=1409253765890, position=29289191) INFO 19:26:34 Enqueuing flush of compactions_in_progress: 1334 (0%) on-heap, 0 (0%) off-heap INFO 19:26:34 Writing Memtable-compactions_in_progress@1430076737(144 serialized bytes, 9 ops, 0%/0% of on/off-heap limit) INFO 19:26:34 Completed flushing /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-8-Data.db (172 bytes) for commitlog position ReplayPosition(segmentId=1409253765891, position=7686763) INFO 19:26:34 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-5-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-8-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-6-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-7-Data.db')] INFO 19:26:34 Compacting [SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-17-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-19-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-16-Data.db'), SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-18-Data.db')] ERROR 19:26:34 Exception in thread Thread[NonPeriodicTasks:1,5,main] java.lang.RuntimeException: disk space is negative at org.apache.cassandra.db.DataTracker.spaceReclaimed(DataTracker.java:422) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:96) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:655) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_60] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_60] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) ~[na:1.7.0_60] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) ~[na:1.7.0_60] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_60] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_60] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_60] INFO 19:26:34 Compacted 4 sstables to [/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-9,]. 423 bytes to 172 (~40% of original) in 51ms = 0.003216MB/s. 4 total partitions merged to 1. Partition merge counts were {1:2, 2:1, } {noformat} And indeed nodetool is reporting negative load: {noformat} Datacenter: datacenter1 ======================= Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 10.208.35.225 -481904132 bytes 256 100.0% 38dd30d1-de65-43dd-9bcb-b0cd018f050c rack1 {noformat} So it looks like we have some kind of race between decrementing and incrementing. > Nodetool Status Reports Negative Load With VNodes Disabled > ---------------------------------------------------------- > > Key: CASSANDRA-7239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7239 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: 1000 Nodes EC2 m1.large ubuntu 12.04 > Reporter: Russell Alexander Spitzer > Priority: Minor > Fix For: 2.1.1 > > Attachments: nodetool.png, opscenter.png > > > When I run stress on a large cluster without vnodes (num_token =1 initial > token set) The loads reported by nodetool status are negative, or become > negative after stress is run. > {code} > UN 10.97.155.31 -447426217 bytes 1 0.2% > 8d40568c-044c-4753-be26-4ab62710beba rack1 > > UN 10.9.132.53 -447342449 bytes 1 0.2% > 58e7f255-803d-493b-a19e-58137466fb78 rack1 > > UN 10.37.151.202 -447298672 bytes 1 0.2% > ba29b1f1-186f-45d0-9e59-6a528db8df5d rack1 > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)