[ 
https://issues.apache.org/jira/browse/CASSANDRA-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597362#comment-14597362
 ] 

Stefania commented on CASSANDRA-9102:
-------------------------------------

Thanks for your input. I agree with you on testing races in the kitchen sink 
harness. The reason why I used parallel threads in TestAccuracy is to have the 
test complete in a reasonable amount of time. Each thread uses different 
partitions.

As for unit testing, I attached code coverage results after running some 
relevant dtests: *consistency_test.py, paxos_test.py, batch_test.py, 
counter_test.py and secondary_indexes_test.py*.

I analysed the files you mentioned and here are some quick observations:

||File||Percentage||Missing coverage||
|AbstractReadExecutor|97%|RetryType.ALWAYS|
|ReadCallback|66%|Exception handling or generation (ReadTimeout, ReadFailure, DigestMismatch)|
|StorageProxy|56%|Exception handling (timeouts, failures), hinting due to exception handling and max limit of hints reached, logged batches and triggers, some estimation of ranges due to indexes and range slices, describe cluster (a nodetool command), truncation of data, paxos contentions, jmx methods (but I think we have some nodetool tests I did not run and they are trivial)|
|AbstractRowResolver|90%|OK|
|RowDigestResolver|88%|OK|
|RowDataResolver|79%|replyCount == 1|
|RangeSliceResponseResolver|19%|Everything|

The percentages are approximate because they cover only each file's main 
class, whereas I identified the missing coverage by looking at the entire 
files. They also assume there was no contention between the different 
Cassandra processes sharing the same jacoco coverage file; if there was, the 
actual coverage would be better than reported.

I think we can easily add dtests to cover logged batching, triggers, range 
slices and single replica replies. For the handling of exceptions, we would 
have to add some debug options to instruct a process to timeout or return a 
failure. It's easy to do this via a system property and it already exists for 
write failures, but this requires restarting the node. The alternative would be 
to use JMX.
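
To illustrate, a system-property switch of this kind could look roughly like 
the sketch below. The class name, method names and the property name 
example.test.fail_reads_ks are made up for illustration; they are not 
existing Cassandra options.

```java
import java.util.HashSet;
import java.util.Set;

public final class ReadFaultInjection
{
    // Hypothetical property name, used only for this sketch.
    // It would be passed at startup as -Dexample.test.fail_reads_ks=ks1,ks2
    public static final String FAIL_READS_PROPERTY = "example.test.fail_reads_ks";

    private ReadFaultInjection() {}

    // Parses the comma-separated keyspace list from the system property.
    public static Set<String> keyspacesToFail()
    {
        String value = System.getProperty(FAIL_READS_PROPERTY, "");
        Set<String> result = new HashSet<>();
        for (String ks : value.split(","))
        {
            String trimmed = ks.trim();
            if (!trimmed.isEmpty())
                result.add(trimmed);
        }
        return result;
    }

    // A read path would check this and report a failure instead of replying.
    public static boolean shouldFailRead(String keyspace)
    {
        return keyspacesToFail().contains(keyspace);
    }
}
```

Since the property is normally supplied on the JVM command line, changing it 
from a dtest means restarting the node, which is the drawback mentioned above.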

However, even with timeout and failure generation, I am not sure we would be 
able to test the paxos contentions very well. Plus dtests are always slower and 
harder to debug than unit tests. Perhaps we should bite the bullet and adopt a 
mock framework for MessagingService, at which point it should be possible to 
test StorageProxy via unit tests.
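
As a rough sketch of the idea, assuming the messaging layer could be hidden 
behind a small interface: everything below (Transport, InMemoryTransport, the 
toy read method) is hypothetical and much simpler than the real 
MessagingService and StorageProxy. It only shows how a unit test could make 
replicas unresponsive without starting any nodes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class FakeMessaging
{
    private FakeMessaging() {}

    // Hypothetical stand-in for the real messaging layer.
    public interface Transport
    {
        // Returns true if the replica acknowledged the request.
        boolean send(String endpoint, String request);
    }

    // In-memory transport: records every request and drops those
    // addressed to endpoints marked as unresponsive.
    public static final class InMemoryTransport implements Transport
    {
        private final Set<String> unresponsive = new HashSet<>();
        private final List<String> sent = new ArrayList<>();

        public void dropMessagesTo(String endpoint) { unresponsive.add(endpoint); }

        public List<String> sent() { return sent; }

        @Override
        public boolean send(String endpoint, String request)
        {
            sent.add(endpoint + ":" + request);
            return !unresponsive.contains(endpoint);
        }
    }

    // Toy coordinator: succeeds when at least blockFor replicas acknowledge.
    public static boolean read(Transport transport, List<String> replicas, int blockFor)
    {
        int acks = 0;
        for (String replica : replicas)
            if (transport.send(replica, "READ"))
                acks++;
        return acks >= blockFor;
    }
}
```

A test would then call dropMessagesTo on enough replicas to break the quorum 
and assert that the read fails, all within a single JVM.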

What do you think?

> Consistency levels such as non-local quorum need better tests
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9102
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9102
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Ariel Weisberg
>            Assignee: Stefania
>         Attachments: jacoco.diff, jacoco.tar.gz
>
>
> We didn't catch unit testing for this functionality. There is dtest 
> consistency_test but it doesn't cover non-local functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
