[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900365#comment-13900365 ]

Chris Burroughs commented on CASSANDRA-6683:

bq. I think it would make more sense (and fix Kirill's case) to always call sortByProximityWithScore() and then compare that ordering against the subsnitch list.

FWIW, everyone I have shown this code to thought that's what it did based on the description, and then spent a lot of time being puzzled when they realized it didn't.

> BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-6683
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6683
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux 3.8.0-33-generic
>            Reporter: Kirill Bogdanov
>              Labels: snitch
>             Fix For: 2.0.6
>
>
> There is a problem in *DynamicEndpointSnitch.java* in sortByProximityWithBadness().
> Before calling sortByProximityWithScore we compare each node's score ratio to the badness threshold.
> {code}
> if ((first - next) / first > BADNESS_THRESHOLD)
> {
>     sortByProximityWithScore(address, addresses);
>     return;
> }
> {code}
> This is not always the correct comparison, because the *first* score can be less than the *next* score, and in that case we compare a negative number with a positive one.
> The solution is to compute the absolute value of the ratio:
> {code}
> if (Math.abs((first - next) / first) > BADNESS_THRESHOLD)
> {code}
> This issue causes incorrect sorting of DCs based on their performance and affects the performance of the snitch.
> Thanks.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899388#comment-13899388 ]

Brandon Williams commented on CASSANDRA-6683:

I see what you meant. Yeah, I think it was done the way it was as an optimization, though as you said it's probably not a huge one.
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899375#comment-13899375 ]

Tyler Hobbs commented on CASSANDRA-6683:

bq. I'm not sure what you mean exactly, we always end up calling it, we just don't check badness when it's set to zero.

I was a little confused in my last comment, but let me try again. Right now we only call {{sortByProximityWithScore()}} if {{BADNESS_THRESHOLD != 0}} and two neighbors in the list returned by the subsnitch differ by BADNESS_THRESHOLD. I think it would make more sense (and fix Kirill's case) to always call {{sortByProximityWithScore()}} and then compare that ordering against the subsnitch list. Something like this:

{noformat}
defaultOrder = subsnitch.sort(address, addresses);
// make this return a new list instead of sorting in place
scoredOrder = sortByProximityWithScore(address, addresses);
for (int i = 0; i < defaultOrder.size(); i++)
{
    if (scores.get(defaultOrder.get(i)) > scores.get(scoredOrder.get(i)) * (1 + BADNESS_THRESHOLD))
        return scoredOrder;
}
return defaultOrder;
{noformat}

bq. Possible, but it'd be a lot of work, because it would change the snitch interface and we'd still need the old call because not all uses of it have a consistency level available.

It looks like there aren't too many callers, so it shouldn't be that much work. I would just make the arg optional and default it to the length of {{addresses}}.
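The proposal in the comment above can be tried out in isolation. The following is only a minimal sketch, not the DynamicEndpointSnitch code: endpoints are plain strings, every endpoint is assumed to have a score, and the class name, method names, and sample scores (taken from Kirill's comment) are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompareOrderingsSketch {
    // Assumed value, matching dynamic_snitch_badness_threshold 0.1 from the thread.
    static final double BADNESS_THRESHOLD = 0.1;

    // Returns the score-sorted order only if some position in the default order
    // is worse than the best achievable score at that position by more than the
    // threshold margin; otherwise the default (subsnitch) order is kept.
    static List<String> sortWithBadness(List<String> defaultOrder, Map<String, Double> scores) {
        // Stand-in for sortByProximityWithScore() returning a new list.
        List<String> scoredOrder = new ArrayList<>(defaultOrder);
        scoredOrder.sort(Comparator.comparingDouble((String e) -> scores.get(e)));

        for (int i = 0; i < defaultOrder.size(); i++) {
            double defaultScore = scores.get(defaultOrder.get(i));
            double bestScore = scores.get(scoredOrder.get(i));
            if (defaultScore > bestScore * (1 + BADNESS_THRESHOLD))
                return scoredOrder;
        }
        return defaultOrder;
    }

    // Hypothetical helper: runs the sketch on the scores from Kirill's comment.
    static String kirillCase() {
        Map<String, Double> scores = new HashMap<>();
        scores.put("DC1", 0.1);
        scores.put("DC2", 0.7);
        scores.put("DC3", 0.2);
        scores.put("DC4", 0.2);
        List<String> defaultOrder = Arrays.asList("DC1", "DC2", "DC3", "DC4");
        return String.join(",", sortWithBadness(defaultOrder, scores));
    }

    // Hypothetical helper: scores that all lie within the threshold margin.
    static String closeScoresCase() {
        Map<String, Double> scores = new HashMap<>();
        scores.put("A", 0.50);
        scores.put("B", 0.48);
        scores.put("C", 0.52);
        return String.join(",", sortWithBadness(Arrays.asList("A", "B", "C"), scores));
    }

    public static void main(String[] args) {
        System.out.println(kirillCase());      // DC2 is pushed to the back
        System.out.println(closeScoresCase()); // default order is kept
    }
}
```

With Kirill's scores the second position differs (0.7 against 0.2 * 1.1), so the scored order is returned; when all scores are within the margin, the subsnitch order is left alone.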
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899295#comment-13899295 ]

Brandon Williams commented on CASSANDRA-6683:

bq. was our motivation for not calling sortByProximityWithScore every time just the overhead of that operation

I'm not sure what you mean exactly, we always end up calling it, we just don't check badness when it's set to zero.

bq. perhaps we could add a parameter that specifies how many of the replicas will be used (based on the consistency level)

Possible, but it'd be a lot of work, because it would change the snitch interface and we'd still need the old call because not all uses of it have a consistency level available.
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899277#comment-13899277 ]

Tyler Hobbs commented on CASSANDRA-6683:

[~brandon.williams] was our motivation for not calling {{sortByProximityWithScore}} every time just the overhead of that operation? It seems like it shouldn't have a large impact unless the RF is high.

If we want to handle the high-RF case more efficiently, perhaps we could add a parameter that specifies how many of the replicas will be used (based on the consistency level) and just move the N lowest scores to the front if the first N scores aren't within BADNESS_THRESHOLD.
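One possible reading of the "move the N lowest scores to the front" idea is sketched below. The comment does not say how "within BADNESS_THRESHOLD" would be tested, so the position-by-position margin check used here is an assumption, as are the class name, method names, and string endpoints. The sketch also still performs a full sort for brevity; an implementation aimed at high RF would presumably select only the N best entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PromoteBestSketch {
    // Assumed value, matching dynamic_snitch_badness_threshold 0.1 from the thread.
    static final double BADNESS_THRESHOLD = 0.1;

    // If the first n entries of the default order are not within the threshold
    // margin of the n lowest scores, move those n lowest-scored endpoints to the
    // front and keep the remaining endpoints in their default order.
    static List<String> promoteBest(List<String> defaultOrder, Map<String, Double> scores, int n) {
        int k = Math.min(n, defaultOrder.size());

        List<String> byScore = new ArrayList<>(defaultOrder);
        byScore.sort(Comparator.comparingDouble((String e) -> scores.get(e)));
        List<String> best = new ArrayList<>(byScore.subList(0, k));

        boolean withinThreshold = true;
        for (int i = 0; i < k; i++) {
            if (scores.get(defaultOrder.get(i)) > scores.get(best.get(i)) * (1 + BADNESS_THRESHOLD)) {
                withinThreshold = false;
                break;
            }
        }
        if (withinThreshold)
            return defaultOrder;

        List<String> result = new ArrayList<>(best);
        for (String endpoint : defaultOrder)
            if (!best.contains(endpoint))
                result.add(endpoint);
        return result;
    }

    // Hypothetical helper: four single-replica DCs with illustrative scores.
    static String demo(int n) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("DC1", 0.1);
        scores.put("DC2", 0.7);
        scores.put("DC3", 0.2);
        scores.put("DC4", 0.2);
        return String.join(",", promoteBest(Arrays.asList("DC1", "DC2", "DC3", "DC4"), scores, n));
    }

    public static void main(String[] args) {
        System.out.println(demo(1)); // only one replica needed: order unchanged
        System.out.println(demo(2)); // two replicas needed: DC3 moves ahead of DC2
    }
}
```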
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899130#comment-13899130 ]

Brandon Williams commented on CASSANDRA-6683:

You could test by disabling the dynamic snitch with {{dynamic_snitch: false}} so the sorting is always the same.
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898928#comment-13898928 ]

Kirill Bogdanov commented on CASSANDRA-6683:

Thank you for your answer. I came across this part of the code in DES because I observed a suboptimal choice of nodes in my configuration and started to investigate it.

This is my config:
* PropertyFileSnitch
* dynamic_snitch_badness_threshold: 0.1
* 4 DCs
* keyspace with a replication factor of 1 in each DC
* read repair and speculative_retry are disabled for my tables
* read operations are performed with consistency TWO

I am observing that the local DC that serves a read request has about the same probability of asking any of the 3 remote replicas to confirm consistency TWO, regardless of their scores (is that correct?). Since all nodes are in different DCs, {{subsnitch.sortByProximity}} places the local node at the start of the list (first) but does not sort the remote DCs. After {{subsnitch.sortByProximity}}, the address list with scores may look something like this:
- DC1: 0.1 (first)
- DC2: 0.7
- DC3: 0.2
- DC4: 0.2

Since we do not call {{sortByProximityWithScore}}, we return this list to {{AbstractReadExecutor getReadExecutor}}, where {{consistencyLevel.filterForQuery}} (based on consistency TWO) picks the first 2 addresses from the list. As a result, we send the read request to the suboptimal DC2.

With my change ({{Math.abs()}}) I am seeing a ~15% read throughput improvement in my setup with the cassandra stress tool.

Due to my limited knowledge of Cassandra internals I am probably wrong to blame DES and BADNESS_THRESHOLD, but I would greatly appreciate it if you could point out what the correct behaviour is in the situation above and which module is responsible for sorting nodes by their scores.

Thank you.
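The scenario above can be replayed with a small stand-alone sketch. It assumes that {{first}} is the score of the head of the subsnitch-ordered list and that {{next}} ranges over every entry in that list; that loop shape, the class and method names, and the string endpoints are assumptions made for illustration, not code taken from DynamicEndpointSnitch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KirillScenarioSketch {
    // dynamic_snitch_badness_threshold from the configuration above.
    static final double BADNESS_THRESHOLD = 0.1;

    // Applies the quoted check (useAbs = false) or the Math.abs() variant
    // proposed in the issue (useAbs = true) and returns the resulting order.
    static List<String> order(List<String> subsnitchOrder, Map<String, Double> scores, boolean useAbs) {
        List<String> result = new ArrayList<>(subsnitchOrder);
        double first = scores.get(subsnitchOrder.get(0));
        for (String endpoint : subsnitchOrder) {
            double ratio = (first - scores.get(endpoint)) / first;
            if ((useAbs ? Math.abs(ratio) : ratio) > BADNESS_THRESHOLD) {
                // Stand-in for sortByProximityWithScore(): lowest score first.
                result.sort(Comparator.comparingDouble((String e) -> scores.get(e)));
                break;
            }
        }
        return result;
    }

    // The two endpoints a consistency-TWO read would use: the head of the list.
    static String firstTwo(boolean useAbs) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("DC1", 0.1); // local DC, placed first by the subsnitch
        scores.put("DC2", 0.7);
        scores.put("DC3", 0.2);
        scores.put("DC4", 0.2);
        List<String> subsnitchOrder = new ArrayList<>(scores.keySet());
        return String.join(",", order(subsnitchOrder, scores, useAbs).subList(0, 2));
    }

    public static void main(String[] args) {
        System.out.println("quoted check: " + firstTwo(false));
        System.out.println("with Math.abs(): " + firstTwo(true));
    }
}
```

Under these assumptions every ratio is zero or negative when the local node has the lowest score, so the quoted check never re-sorts and DC2 stays in second place, while the Math.abs() variant re-sorts and puts DC3 second.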
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898320#comment-13898320 ]

Tyler Hobbs commented on CASSANDRA-6683:

CASSANDRA-6465 doesn't affect what [~kirill.sc] is referring to.

However, when the score for {{first}} is less than {{next}}, the DES is behaving correctly. Although {{(first - next) / first}} results in a negative number, that result is less than {{BADNESS_THRESHOLD}}, which results in the DES saying that {{first}} should be used. This is the correct behavior because {{first}} has a lower score (less "badness") than {{next}}.

The whole point of this check is that the DES should only use something other than {{first}} if the score for {{next}} is lower (less bad) than the score for {{first}} by a certain margin (BADNESS_THRESHOLD).
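The sign argument above can be checked with a few numbers. This is a small illustrative sketch of the quoted condition only; the class name, method name, and the threshold value of 0.1 are assumptions for the example.

```java
public class BadnessCheckSketch {
    // Assumed value, matching dynamic_snitch_badness_threshold 0.1 from the thread.
    static final double BADNESS_THRESHOLD = 0.1;

    // The condition quoted in the issue: true means "do not trust the default
    // order, re-sort by score".
    static boolean shouldResort(double first, double next) {
        return (first - next) / first > BADNESS_THRESHOLD;
    }

    public static void main(String[] args) {
        // first is better than next: ratio is negative, keep first.
        System.out.println(shouldResort(0.1, 0.7));
        // next is much better than first: ratio is about 0.86, re-sort.
        System.out.println(shouldResort(0.7, 0.1));
        // next is only slightly better: ratio is about 0.04, keep first.
        System.out.println(shouldResort(0.50, 0.48));
    }
}
```

Only the middle case, where {{next}} beats {{first}} by more than the margin, triggers a re-sort.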
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896632#comment-13896632 ]

Brandon Williams commented on CASSANDRA-6683:

I don't think this should happen after CASSANDRA-6465, wdyt [~thobbs]?
[jira] [Commented] (CASSANDRA-6683) BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896462#comment-13896462 ]

Chris Burroughs commented on CASSANDRA-6683:

Earlier DES problems were reported in CASSANDRA-6465.

> BADNESS_THRESHOLD does not working correctly with DynamicEndpointSnitch
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-6683
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6683
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux 3.8.0-33-generic
>            Reporter: Kirill Bogdanov
>              Labels: snitch
>             Fix For: 2.0.5