[jira] [Commented] (CASSANDRA-14555) Verify effect of CASSANDRA-14252 on streaming endpoint selection
[ https://issues.apache.org/jira/browse/CASSANDRA-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534153#comment-16534153 ]

Dikang Gu commented on CASSANDRA-14555:

Thanks [~jay.zhuang]!

> Verify effect of CASSANDRA-14252 on streaming endpoint selection
>
> Key: CASSANDRA-14555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14555
> Project: Cassandra
> Issue Type: Task
> Components: Streaming and Messaging
> Reporter: Sam Tunnicliffe
> Priority: Major
> Fix For: 4.x
>
> CASSANDRA-14252 makes a slight change to {{DynamicEndpointSnitch}} so that it
> is somewhat more likely a replica in a remote DC is contacted when replicas
> in the local DC are considered degraded. This seems reasonable on the read
> path, but it could also affect selection of endpoints for streaming and cross
> DC streaming is probably something that operators want to control more
> tightly. To be clear, I’m not 100% sure that this is actually an issue, but
> I’d like to have some investigation into it before we ship a change to
> default behaviour.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14555) Verify effect of CASSANDRA-14252 on streaming endpoint selection
[ https://issues.apache.org/jira/browse/CASSANDRA-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533906#comment-16533906 ]

Jay Zhuang commented on CASSANDRA-14555:

Reverted. cc. [~dikanggu]
[jira] [Commented] (CASSANDRA-14555) Verify effect of CASSANDRA-14252 on streaming endpoint selection
[ https://issues.apache.org/jira/browse/CASSANDRA-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532946#comment-16532946 ]

Aleksey Yeschenko commented on CASSANDRA-14555:

That might be true, but I share [~beobal]'s concern here. I don't think it was perfectly right to introduce the default change so late in the minor cycle in 3.0.17. Can we revert from 3.0 and 3.11 please? And apply the suggested mitigations to trunk? Thanks.
[jira] [Commented] (CASSANDRA-14555) Verify effect of CASSANDRA-14252 on streaming endpoint selection
[ https://issues.apache.org/jira/browse/CASSANDRA-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532046#comment-16532046 ]

Jay Zhuang commented on CASSANDRA-14555:

Thanks [~beobal] for pointing out the issue. I think CASSANDRA-14252 is a further fix for CASSANDRA-13074, which fixes the same issue as CASSANDRA-2662:

{quote}[~brandon.williams]: Given coordinator A, and replicas X, Y, and Z (in subsnitch order), on the first round X will be chosen, and let's say it receives a score of 1. With the patch, at this point Y and Z will be initialized with zero. On the next round, Y will be chosen, and let's say it receives a score of or near 1, depending on network latency. On the third round, Z will be chosen, and let's say it also receives a score similar to Y. Now the cache is hot on all nodes, and subsequent reads have the possibility to oscillate between all three based on network latency variance. This can be mitigated though with the badness threshold. With the badness threshold on, the first round will occur as before, but subsequent rounds will continue to use X until it degrades past the threshold, at which point they will use Y, until the dynamic snitch reset()s, at which point everything will repeat. I don't think this is harmful to CASSANDRA-1314 after all.
https://issues.apache.org/jira/browse/CASSANDRA-2662?focusedCommentId=13035597=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13035597
{quote}

It's definitely a valid point that data streaming may prefer a local node (by {{subsnitch}}) to an unknown node. In the example above, let's say replica Z is in another DC: streaming may not want to select Z unless X and Y are really bad. But there's a chance that it could be selected (even without CASSANDRA-14252) when there's no score for Z.
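The badness-threshold behaviour described in the quote can be modelled roughly as below. This is a simplified Python sketch for illustration only, not the code in {{DynamicEndpointSnitch}}: the function name {{sort_with_badness}} is hypothetical, and treating a replica with no recorded score as 0.0 is an assumption made here to show how an unscored replica could end up ahead of scored ones.

```python
def sort_with_badness(subsnitch_order, scores, badness_threshold):
    """Order replicas for a request (simplified, hypothetical model)."""
    # Assumption: a replica with no recorded latency score counts as 0.0.
    def score(replica):
        return scores.get(replica, 0.0)

    in_subsnitch_order = [score(r) for r in subsnitch_order]
    best_first = sorted(in_subsnitch_order)

    # Keep the subsnitch order unless some replica is worse than the
    # best-ranked score at the same position by more than the threshold.
    for preferred, best in zip(in_subsnitch_order, best_first):
        if preferred > best * (1.0 + badness_threshold):
            # sorted() is stable, so ties keep their subsnitch order.
            return sorted(subsnitch_order, key=score)
    return list(subsnitch_order)


replicas = ["X", "Y", "Z"]  # say X and Y are local and Z is in a remote DC
print(sort_with_badness(replicas, {"X": 1.0, "Y": 0.95, "Z": 0.97}, 0.1))  # ['X', 'Y', 'Z']
print(sort_with_badness(replicas, {"X": 2.0, "Y": 1.0, "Z": 1.0}, 0.1))    # ['Y', 'Z', 'X']
print(sort_with_badness(replicas, {"X": 1.0, "Y": 1.0}, 0.1))              # ['Z', 'X', 'Y']
```

In the last call Z has no score, so under the stated assumption it sorts first even though X and Y are healthy. That is the case that matters for streaming.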
So maybe we could introduce another {{sortByProximity()}} to exclude the zero score endpoints for streaming, or have a different {{dynamic_snitch_badness_threshold}} for streaming (with a [change|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java#L251] to support zero scores). Overall, I don't think CASSANDRA-14252 should be reverted and it should not block the {{3.0.17}} and {{3.11.3}} releases.

cc. [~jkni], [~tjake], [~brandon.williams].
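One possible reading of the two mitigations suggested in this comment is sketched below in Python. It is not a proposed patch: {{sort_for_streaming}} is a hypothetical name, and placing unscored endpoints after scored ones (rather than dropping them) is an assumption about what "exclude the zero score endpoints" could mean. The threshold is passed in separately so that streaming can use a different value from reads.

```python
def sort_for_streaming(subsnitch_order, scores, streaming_badness_threshold):
    """Order candidate streaming sources (simplified, hypothetical model)."""
    # Only endpoints with a positive recorded score take part in ranking;
    # unscored endpoints keep their subsnitch order and go last (assumption).
    scored = [r for r in subsnitch_order if scores.get(r, 0.0) > 0.0]
    unscored = [r for r in subsnitch_order if scores.get(r, 0.0) <= 0.0]

    in_subsnitch_order = [scores[r] for r in scored]
    best_first = sorted(in_subsnitch_order)
    degraded = any(
        preferred > best * (1.0 + streaming_badness_threshold)
        for preferred, best in zip(in_subsnitch_order, best_first)
    )
    if degraded:
        # Stable sort: equal scores keep their subsnitch order.
        scored = sorted(scored, key=lambda r: scores[r])
    return scored + unscored


replicas = ["X", "Y", "Z"]  # say X and Y are local and Z is in a remote DC
print(sort_for_streaming(replicas, {"X": 1.0, "Y": 1.0}, 0.1))            # ['X', 'Y', 'Z']
print(sort_for_streaming(replicas, {"X": 3.0, "Y": 1.0}, 0.1))            # ['Y', 'X', 'Z']
print(sort_for_streaming(replicas, {"X": 1.5, "Y": 1.0, "Z": 1.0}, 1.0))  # ['X', 'Y', 'Z']
print(sort_for_streaming(replicas, {"X": 1.5, "Y": 1.0, "Z": 1.0}, 0.1))  # ['Y', 'Z', 'X']
```

The last two calls differ only in the threshold: with a looser streaming threshold the local replica X stays first, while the tighter value lets the remote replica Z move ahead of it.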
[jira] [Commented] (CASSANDRA-14555) Verify effect of CASSANDRA-14252 on streaming endpoint selection
[ https://issues.apache.org/jira/browse/CASSANDRA-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531435#comment-16531435 ]

Michael Shuler commented on CASSANDRA-14555:

Linked CASSANDRA-14252 and also noticed there is an uncommitted dtest in a comment: [https://github.com/apache/cassandra-dtest/compare/master...cooldoger:14252]

cc: [~dikanggu] and [~jay.zhuang]