[ 
https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726617#comment-13726617
 ] 

Yuki Morishita commented on CASSANDRA-2698:
-------------------------------------------

Benedict,

Thanks for the update.

bq. 1) I'm not sure what you mean by not "serializing those" 

I meant you don't have to send those back to the coordinator.
Changing the serialized format means we have to bump the messaging version 
defined in MessagingService.
2.0 is already in feature freeze, so I think it's better to wait until the next 
minor version for a message format change.
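
To illustrate why a format change is tied to a version bump, here is a minimal sketch of version-gated serialization. The class name, the version constants and the hash/rowCount fields are all hypothetical and only stand in for the real message and MessagingService versions; this is not the actual serializer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionedSerializerSketch {
    // Hypothetical version numbers, for illustration only.
    public static final int VERSION_OLD = 1;
    public static final int VERSION_NEW = 2;

    // Writes the extra field only when the peer understands the new format.
    public static byte[] serialize(long hash, long rowCount, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeLong(hash);
            if (peerVersion >= VERSION_NEW)
                out.writeLong(rowCount); // new field, invisible to old peers
            out.flush();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(serialize(42L, 7L, VERSION_OLD).length);
        System.out.println(serialize(42L, 7L, VERSION_NEW).length);
    }
}
```

An old peer would misread the stream if the new field were written unconditionally, which is why the change needs its own messaging version.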

Also, I looked at the change made to MerkleTree#differenceHelper, and I'm still 
not sure how the row count helps improve the logic.
How does it differ from just using the hash value?

bq. 5) One thing we might want to consider changing is the format of the 
EstimatedHistogram ranges in the log messages

Yeah, especially the -1 in (-1, 0] feels weird. How about omitting the lower 
bound from the label and printing something like:

{code}
 ~0: xxx
~10: xxx
~20: xxx
{code}
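
A rough sketch of how such labels could be produced, assuming we have the bucket upper bounds and counts as plain arrays. HistogramLabels and format are hypothetical names and this does not use the EstimatedHistogram API.

```java
import java.util.ArrayList;
import java.util.List;

public class HistogramLabels {
    // Hypothetical helper: upperBounds[i] is the upper bound of bucket i,
    // counts[i] is the number of entries that fell into that bucket.
    public static List<String> format(long[] upperBounds, long[] counts) {
        // Find the widest label so the colons line up.
        int width = 0;
        for (long bound : upperBounds)
            width = Math.max(width, ("~" + bound).length());

        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < upperBounds.length; i++) {
            String label = "~" + upperBounds[i];
            // Right-align the label, then append the count.
            lines.add(String.format("%" + width + "s: %d", label, counts[i]));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : format(new long[]{0, 10, 20}, new long[]{3, 12, 7}))
            System.out.println(line);
    }
}
```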

nit: you should surround the whole output logic in Validator#complete with a 
`logger.isDebugEnabled` check
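
Something along these lines is what I have in mind. This is only a sketch: DebugLogger is a minimal stand-in interface so the example is self-contained (it is not the real logging API), and logHistogram and describe are hypothetical names, not the actual Validator code.

```java
import java.util.ArrayList;
import java.util.List;

public class DebugGuardSketch {
    // Minimal stand-in for a logger that exposes isDebugEnabled()/debug().
    public interface DebugLogger {
        boolean isDebugEnabled();
        void debug(String message);
    }

    // Counts how often the (potentially expensive) formatting actually runs.
    public static int formatCalls = 0;

    public static String describe(long[] counts) {
        formatCalls++;
        StringBuilder sb = new StringBuilder();
        for (long c : counts)
            sb.append(c).append(' ');
        return sb.toString().trim();
    }

    public static void logHistogram(DebugLogger logger, long[] counts) {
        // Skip all formatting work unless debug output is enabled.
        if (!logger.isDebugEnabled())
            return;
        logger.debug("rows per range: " + describe(counts));
    }

    // Test logger that records messages into the given list.
    public static DebugLogger collecting(final boolean enabled, final List<String> sink) {
        return new DebugLogger() {
            public boolean isDebugEnabled() { return enabled; }
            public void debug(String message) { sink.add(message); }
        };
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<String>();
        logHistogram(collecting(false, sink), new long[]{3, 12, 7});
        logHistogram(collecting(true, sink), new long[]{3, 12, 7});
        System.out.println(sink + " formatCalls=" + formatCalls);
    }
}
```

With the guard in place the histogram is only formatted when debug logging is on, so the normal repair path pays nothing for it.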
                
> Instrument repair to be able to assess it's efficiency (precision)
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-2698
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Benedict
>            Priority: Minor
>              Labels: lhf
>         Attachments: nodetool_repair_and_cfhistogram.tar.gz, 
> patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff, 
> patch.taketwo.forreview.diff
>
>
> Some reports indicate that repair sometimes transfers huge amounts of data. One 
> hypothesis is that the merkle tree precision may deteriorate too much at some 
> data size. To check this hypothesis, it would be reasonable to gather 
> statistics during the merkle tree building on how many rows each merkle tree 
> range accounts for (and the size that this represents). It is probably an 
> interesting statistic to have anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
