[ 
https://issues.apache.org/jira/browse/CASSANDRA-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857338#action_12857338
 ] 

Jonathan Ellis commented on CASSANDRA-981:
------------------------------------------

Stu suggested Vivaldi positioning 
(http://swtch.com/~rsc/talks/vivaldi-ccs.pdf), but that is actually solving a 
different problem.  (How do you estimate network distance, without actually 
talking to the node in question?)  Since we are in near-constant communication 
with all the nodes in the system, we should be able to leverage that for a 
solution that is both simpler and quicker to adapt to changing conditions.

This problem seems similar to the sliding window of heartbeat times that we 
maintain for failure detection, but that approach seems too heavyweight to 
apply when processing thousands of responses per second.  Perhaps restricting 
the sampling to "at most N per second" would work, although I'm not sure 
whether the phi algorithm can work on samples that don't arrive at evenly 
spaced intervals.
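
To make the "at most N per second" idea concrete, here is a rough sketch of a 
capped sliding window of response latencies.  Everything in it is hypothetical 
(the class name SampledLatencyWindow, the offer/mean methods, the use of a 
plain mean); it is not existing Cassandra code, and it sidesteps the phi 
question entirely by computing a simple average instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch: a fixed-size sliding window of response latencies
 * that records at most maxSamplesPerSecond samples in any wall-clock second.
 */
final class SampledLatencyWindow {
    private final int maxSamplesPerSecond;
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private long currentSecond = Long.MIN_VALUE;
    private int acceptedThisSecond = 0;
    private double sum = 0.0;

    SampledLatencyWindow(int maxSamplesPerSecond, int windowSize) {
        this.maxSamplesPerSecond = maxSamplesPerSecond;
        this.windowSize = windowSize;
    }

    /** Offers one latency; returns true if recorded, false if the cap dropped it. */
    synchronized boolean offer(long nowMillis, double latencyMillis) {
        long second = Math.floorDiv(nowMillis, 1000L);
        if (second != currentSecond) {
            // A new second has started, so reset the per-second budget.
            currentSecond = second;
            acceptedThisSecond = 0;
        }
        if (acceptedThisSecond >= maxSamplesPerSecond) {
            return false; // over budget: cheap early exit on the hot path
        }
        acceptedThisSecond++;
        window.addLast(latencyMillis);
        sum += latencyMillis;
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // evict the oldest sample
        }
        return true;
    }

    /** Mean latency over the window, or NaN if nothing has been recorded. */
    synchronized double mean() {
        return window.isEmpty() ? Double.NaN : sum / window.size();
    }

    synchronized int size() {
        return window.size();
    }
}
```

Note that this keeps the first N samples of each second rather than a random 
subset, so it is biased toward whatever arrives early in the second; that may 
or may not matter in practice.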


> Dynamic endpoint snitch
> -----------------------
>
>                 Key: CASSANDRA-981
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-981
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.7
>
>
> An endpoint snitch that automatically and dynamically infers "distance" to 
> other machines without having to explicitly configure rack and datacenter 
> positions solves two problems:
>
> 1. The killer feature here is adapting to things like compaction or a 
>    failing-but-not-yet-dead disk.  This is important, since when we are 
>    doing reads we pick the "closest" replica for actually reading data from 
>    (and only read md5s from other replicas).  This means that if the closest 
>    replica by network topology is temporarily slow due to compaction (for 
>    instance), we'll have to block for its reply even if we get the other 
>    replies much much faster.
> 2. Not having to manually re-sync your configuration with your network 
>    topology when changes (adding machines) are made is a nice bonus.
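
As an illustration of the replica selection described in the issue, the 
sketch below orders replicas by a measured mean latency instead of by 
configured topology.  The class and method names (LatencyRanking, 
sortByLatency) and the use of plain strings for endpoints are assumptions 
made for the example, not the actual snitch interface.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: order replicas by observed latency, fastest first. */
final class LatencyRanking {
    /**
     * Returns a copy of replicas sorted by mean latency in milliseconds.
     * Replicas with no measurement sort last; because List.sort is stable,
     * ties keep their incoming order (for example, a topology-based order).
     */
    static List<String> sortByLatency(List<String> replicas,
                                      Map<String, Double> meanLatencyMillis) {
        List<String> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingDouble(
                (String r) -> meanLatencyMillis.getOrDefault(r, Double.MAX_VALUE)));
        return sorted;
    }
}
```

With a ranking like this, the data read would go to the first entry and the 
digest reads to the rest, so a replica that is temporarily slow drops down 
the list on its own.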

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
