Wei Deng created CASSANDRA-12184:
------------------------------------

             Summary: incorrect compaction log information on totalSourceRows 
in C* pre-3.8 versions
                 Key: CASSANDRA-12184
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12184
             Project: Cassandra
          Issue Type: Bug
          Components: Compaction, Observability
            Reporter: Wei Deng
            Priority: Minor


I was looking at some confusing compaction log information on C* 3.0.7 and 
realized that we have a bug that was trivially fixed in C* 3.8.
Basically here is the log entry in debug.log (as most of the compaction related 
log have been moved to debug.log due to adjustment in CASSANDRA-10241).

{noformat}
DEBUG [CompactionExecutor:6] 2016-07-12 15:38:28,471  CompactionTask.java:217 - 
Compacted (96aa1ba6-4846-11e6-adb7-17866fa8ddfd) 4 sstables to 
[/var/lib/cassandra/data/keyspace1/standard1-713f7920484411e6adb717866fa8ddfd/mb-5-big,]
 to level=0.  267,974,735 bytes to 78,187,400 (~29% of original) in 39,067ms = 
1.908652MB/s.  0 total partitions merged to 332,904.  Partition merge counts 
were {1:9008, 2:34822, 3:74505, 4:214569, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:51:56,578  CompactionTask.java:217 - 
Compacted (786cd9d0-4872-11e6-8755-79a37e6d8141) 4 sstables to 
[/var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/mb-25-big,]
 to level=0.  620 bytes to 498 (~80% of original) in 51ms = 0.009312MB/s.  0 
total partitions merged to 6.  Partition merge counts were {1:4, 3:2, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:51:58,345  CompactionTask.java:217 - 
Compacted (79771de0-4872-11e6-8755-79a37e6d8141) 4 sstables to 
[/var/lib/cassandra/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/mb-65-big,]
 to level=0.  14,113 bytes to 9,553 (~67% of original) in 70ms = 0.130149MB/s.  
0 total partitions merged to 16.  Partition merge counts were {1:13, 2:2, 3:1, }
DEBUG [CompactionExecutor:3] 2016-07-12 20:52:00,415  CompactionTask.java:217 - 
Compacted (7ab6a2c0-4872-11e6-8755-79a37e6d8141) 4 sstables to 
[/var/lib/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/mb-85-big,]
 to level=0.  1,066 bytes to 611 (~57% of original) in 48ms = 0.012139MB/s.  0 
total partitions merged to 16.  Partition merge counts were {1:13, 2:2, 4:1, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:52:00,442  CompactionTask.java:217 - 
Compacted (7abae880-4872-11e6-8755-79a37e6d8141) 4 sstables to 
[/var/lib/cassandra/data/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-77-big,]
 to level=0.  6,910 bytes to 4,396 (~63% of original) in 48ms = 0.087341MB/s.  
0 total partitions merged to 16.  Partition merge counts were {1:13, 2:2, 3:1, }
{noformat}

Note no matter if it's system table or user table, it's always showing "0 total 
partitions merged to xx", which is incorrect information due to this code 
segment 
https://github.com/apache/cassandra/blob/cassandra-3.0.7/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L215-217.
 Basically it only initialized the totalSourceRows value with 0 and never 
assigned a real calculated value to it. Looks like the latest commit from 
CASSANDRA-12080 fixed this problem, but it only got checked into 3.8 branch.

Since this can make people doubt the accuracy of compaction related log 
entries, and the changes made in CASSANDRA-12080 are only log metrics related, 
low impact changes, I'd advocate we backport the change from CASSANDRA-12080 
into C*-3.0 branch as many people's production C*-3.0 version can benefit from 
the bug fix, along with better compaction log information in general. I realize 
that CASSANDRA-12080 may be based on the C*-3.6 changes in CASSANDRA-10805, so 
this means we may have to bring in changes from CASSANDRA-10805 as well if 
CASSANDRA-12080 cannot be cleanly rebased on C*-3.0 branch, but both are going 
to benefit compaction observability in production on C*-3.0.x versions, so both 
should be welcomed changes in C*-3.0 branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to