[ https://issues.apache.org/jira/browse/CASSANDRA-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597362#comment-14597362 ]
Stefania commented on CASSANDRA-9102:
-------------------------------------

Thanks for your input. I agree with you on testing races in the kitchen sink harness. The reason I used parallel threads in TestAccuracy is to have the test complete in a reasonable amount of time. Each thread uses different partitions.

As for unit testing, I attached code coverage results after running some relevant dtests: *consistency_test.py, paxos_test.py, batch_test.py, counter_test.py and secondary_indexes_test.py*. I analysed the files you mentioned and here are some quick observations:

||File||Percentage||Missing coverage||
|AbstractReadExecutor|97%|RetryType.ALWAYS|
|ReadCallback|66%|Exception handling or generation (ReadTimeout, ReadFailure, DigestMismatch)|
|StorageProxy|56%|Exception handling (timeouts, failures), hinting due to exception handling and max limit of hints reached, logged batches and triggers, some estimation of ranges due to indexes and range slices, describe cluster (a nodetool command), truncation of data, paxos contentions, jmx methods (but I think we have some nodetool tests I did not run and they are trivial)|
|AbstractRowResolver|90%|OK|
|RowDigestResolver|88%|OK|
|RowDataResolver|79%|replyCount == 1|
|RangeSliceResponseResolver|19%|Everything|

The percentages aren't totally correct because they refer to the main class only, whereas the missing components were analysed by looking at the entire files. The figures also assume there was no contention between different Cassandra processes sharing the same jacoco coverage file; if there was contention, the actual coverage would be better than reported.

I think we can easily add dtests to cover logged batching, triggers, range slices and single replica replies. For the handling of exceptions, we would have to add some debug options to instruct a process to time out or return a failure. It's easy to do this via a system property, and one already exists for write failures, but this requires restarting the node.
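
A minimal sketch of what such a debug option could look like, assuming a test-only hook that reads its initial mode from a system property. Everything here is hypothetical and for illustration only: the class `ReadFaultInjection`, the property name `example.test.read_fault` and the `Mode` values are not existing Cassandra names. The runtime setter is included only to show why a value read once at startup needs a node restart to change, unless something else in the process can update it.

```java
import java.util.Locale;

/** Hypothetical test-only hook; none of these names exist in Cassandra. */
final class ReadFaultInjection {

    /** What a replica should do with an incoming read while the hook is active. */
    enum Mode { NONE, TIMEOUT, FAILURE }

    /** Hypothetical property name, chosen for illustration only. */
    static final String PROPERTY = "example.test.read_fault";

    /** Initialised once from the JVM system properties when the class is loaded. */
    private static volatile Mode mode = parse(System.getProperty(PROPERTY));

    private ReadFaultInjection() {}

    /** Parses a raw property value; missing or unknown values disable injection. */
    static Mode parse(String raw) {
        if (raw == null) {
            return Mode.NONE;
        }
        try {
            return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Mode.NONE;
        }
    }

    /** Current mode, as a read path could consult it before replying. */
    static Mode mode() {
        return mode;
    }

    /** Runtime override; without a call like this, changing the mode means restarting. */
    static void setMode(Mode newMode) {
        mode = (newMode == null) ? Mode.NONE : newMode;
    }

    public static void main(String[] args) {
        // Run with -Dexample.test.read_fault=timeout to see the startup value.
        System.out.println("startup mode: " + mode());
        setMode(Mode.FAILURE);
        System.out.println("after override: " + mode());
    }
}
```

A read path would consult `ReadFaultInjection.mode()` and either drop the response, reply with a failure, or proceed normally. How that would be wired into the real classes is not shown here.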
The alternative would be to use JMX. However, even with timeout and failure generation, I am not sure we would be able to test the paxos contentions very well. Plus dtests are always slower and harder to debug than unit tests. Perhaps we should bite the bullet and adopt a mock framework for MessagingService, at which point it should be possible to test StorageProxy via unit tests. What do you think?

> Consistency levels such as non-local quorum need better tests
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9102
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9102
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Ariel Weisberg
>            Assignee: Stefania
>         Attachments: jacoco.diff, jacoco.tar.gz
>
>
> We didn't catch unit testing for this functionality. There is dtest
> consistency_test but it doesn't cover non-local functionality.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)