[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242828#comment-13242828 ] Vijay commented on CASSANDRA-3722: -- wfm +1 on both the v5 and nt PROXYHISTOGRAMS > Send Hints to Dynamic Snitch when Compaction or repair is going on for a node. > -- > > Key: CASSANDRA-3722 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3722 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 1.1.0 >Reporter: Vijay >Assignee: Vijay >Priority: Minor > Attachments: 0001-CASSANDRA-3722-A1-V2.patch, > 0001-CASSANDRA-3722-A1.patch, 0001-CASSANDRA-3722-v3.patch, > 0001-CASSANDRA-3723-A2-Patch.patch, > 0001-Expose-SP-latencies-in-nodetool-proxyhistograms.txt, 3722-v4.txt, > 3722-v5.txt > > > Currently Dynamic snitch looks at the latency for figuring out which node > will be better serving the requests, this works great but there is a part of > the traffic sent to collect this data... There is also a window when Snitch > doesn't know about some major event which are going to happen on the node > (Node which is going to receive the data request). > It would be great if we can send some sort hints to the Snitch so they can > score based on known events causing higher latencies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242802#comment-13242802 ]

Brandon Williams commented on CASSANDRA-3722:

bq. Is it because the lack of a datapoint isn't taken into account as slowness?

Exactly. It's not receiving new data, so the score doesn't change and the dead host is still rated the best until the FD removes it as an option. Doing it this way, time itself penalizes the host when it stops responding.
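To illustrate the point about time penalizing a silent host, here is a minimal sketch. It is not the code from the attached patches: the class `TimePenalizedScores`, its method names, and the 0.75/0.25 smoothing weights are all hypothetical, and the score is simply the smoothed latency plus the milliseconds elapsed since the last reply.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the DynamicEndpointSnitch implementation. */
final class TimePenalizedScores {
    private final Map<String, Double> meanLatencyMs = new HashMap<>();
    private final Map<String, Long> lastReplyAtMs = new HashMap<>();

    /** Record a reply from a host, with its latency and the time it arrived. */
    void recordReply(String host, double latencyMs, long nowMs) {
        // Exponential moving average; the weights are arbitrary for this sketch.
        meanLatencyMs.merge(host, latencyMs, (old, cur) -> 0.75 * old + 0.25 * cur);
        lastReplyAtMs.put(host, nowMs);
    }

    /** Lower is better. A host that stops replying gets worse as time passes. */
    double score(String host, long nowMs) {
        Double mean = meanLatencyMs.get(host);
        Long last = lastReplyAtMs.get(host);
        if (mean == null || last == null) {
            return 0.0; // no data yet: neutral score (an arbitrary choice)
        }
        return mean + (nowMs - last);
    }
}
```

With a latency-only score, a fast host that dies keeps its good score because no new datapoints arrive. Adding the elapsed time means its score degrades on its own, without waiting for the failure detector.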
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242800#comment-13242800 ]

Jonathan Ellis commented on CASSANDRA-3722:

Why doesn't dsnitch incorporate this automatically? Is it because the lack of a datapoint isn't taken into account as slowness?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242729#comment-13242729 ]

Vijay commented on CASSANDRA-3722:

Did some testing (single DC) and it works as expected. +1 for v4. Thanks!
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240524#comment-13240524 ]

Brandon Williams commented on CASSANDRA-3722:

A few things while I continue to test:

Compaction can occur before the gossiper is started (and indeed, while gossip is deactivated):

{noformat}
ERROR 23:38:09,192 Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.AssertionError
    at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1105)
    at org.apache.cassandra.service.StorageService.reportSeverity(StorageService.java:790)
    at org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutor.beginCompaction(CompactionManager.java:1021)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:136)
    at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:128)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

getConnectionManagers is unused, but I imagine that was to help with the hashcode business that we can now remove.
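One way to avoid the assertion above is to skip the severity report while gossip is not running. The sketch below is only an assumption about the shape of such a guard: `SeverityReporter`, `setGossipActive`, and `lastPublished` are hypothetical names, not Cassandra APIs, and the real fix in the patch may differ.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the reporting path; no Cassandra classes are used. */
final class SeverityReporter {
    private final AtomicBoolean gossipActive = new AtomicBoolean(false);
    private volatile double lastPublished = 0.0;

    /** Called when gossip starts or stops. */
    void setGossipActive(boolean active) {
        gossipActive.set(active);
    }

    /**
     * Publishes the severity only when gossip is active.
     * Returns false (and does nothing) otherwise, instead of failing an assertion.
     */
    boolean reportSeverity(double severity) {
        if (!gossipActive.get()) {
            return false; // compaction may start before gossip does
        }
        lastPublished = severity;
        return true;
    }

    double lastPublished() {
        return lastPublished;
    }
}
```

A compaction that begins before gossip starts then simply goes unreported, which seems acceptable since no peer can read the state at that point anyway.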
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237102#comment-13237102 ]

Brandon Williams commented on CASSANDRA-3722:

We should probably avoid having Gossiper inject application states directly, if for nothing else than to not make life harder for CASSANDRA-3125.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236306#comment-13236306 ]

Vijay commented on CASSANDRA-3722:

I was almost done with the patch, but filtering based on the pending queue can be potentially dangerous. On a multi-region cluster the pending commands on the local replicas will almost always be higher than on the remote ones, because the remote ones don't receive reads. We might not want to filter based on pending. We can do a hacky solution by padding the scores of the remote DCs to an artificially higher value than the local DC. What do you guys think?
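The padding idea can be sketched as follows. This is a hypothetical illustration, not code from the patch: `DcPaddedScore`, `effectiveScore`, and the padding constant are made-up names and values, and lower scores are assumed to be better.

```java
/** Hypothetical sketch of padding remote-DC scores; names and values are illustrative. */
final class DcPaddedScore {
    /** Arbitrary padding, assumed to exceed any realistic local score. */
    static final double REMOTE_DC_PADDING = 1000.0;

    /** Lower is better. Remote-DC endpoints are pushed behind all local ones. */
    static double effectiveScore(double rawScore, String endpointDc, String localDc) {
        return endpointDc.equals(localDc) ? rawScore : rawScore + REMOTE_DC_PADDING;
    }
}
```

With this, an idle remote replica with a raw score of 5 still sorts after a busy local replica with a raw score of 50, so a low pending count on a remote node does not pull reads across regions.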
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219639#comment-13219639 ]

Pavel Yaskevich commented on CASSANDRA-3722:

Also, maintaining a kind of normalized load statistic for each node in combination with pending requests could give a better view of what is going on on the node. For example, if load is <= 0.5 but we have a big pending queue for/on the node, that could mean network failure.

For the statistic we can assign each of the sub-routines a "load impact" factor (e.g. compaction 0.3, scrub 0.2, read 0.01), set a load threshold for "overloaded" nodes (e.g. 0.85, which could be adjusted at runtime), and sort hosts according to their load + pending request statistics, which would make penalizing hosts more precise. Obviously some normalization should be done, because clusters won't always have nodes with identical processing capabilities (network, hardware etc.).
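The load statistic described above could look roughly like this. The impact factors (0.3, 0.2, 0.01) and the 0.85 threshold come from the comment; everything else is an assumption for illustration, including the class `NodeLoad`, the idea of summing factor times running task count, and the cap at 1.0.

```java
import java.util.Map;

/** Hypothetical sketch of a per-node load statistic; not part of any patch. */
final class NodeLoad {
    enum Activity { COMPACTION, SCRUB, READ }

    /** Threshold from the comment; it would be adjustable at runtime. */
    static final double OVERLOAD_THRESHOLD = 0.85;

    /** "Load impact" factors taken from the comment. */
    static double impact(Activity activity) {
        switch (activity) {
            case COMPACTION: return 0.3;
            case SCRUB:      return 0.2;
            default:         return 0.01; // READ
        }
    }

    /** Sum of impact * running count, capped at 1.0 (the cap is an assumption). */
    static double load(Map<Activity, Integer> running) {
        double total = 0.0;
        for (Map.Entry<Activity, Integer> e : running.entrySet()) {
            total += impact(e.getKey()) * e.getValue();
        }
        return Math.min(1.0, total);
    }

    static boolean overloaded(Map<Activity, Integer> running) {
        return load(running) > OVERLOAD_THRESHOLD;
    }
}
```

This leaves out the normalization across heterogeneous hardware that the comment calls for; a real version would need to scale the factors per node.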
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217657#comment-13217657 ]

Jonathan Ellis commented on CASSANDRA-3722:

bq. Taking into account the number of outstanding requests is IMO a necessity

It sounds like you're saying that if coordinator X has 1000 requests pending response from replica Y, and only 100 to replica Z, then X should suspect that Y is having trouble and rely more heavily on Z, even before requests to Y start timing out. Right?

That sounds reasonable to me in theory. How do we mix that into the existing latency information we track for dsnitch?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217445#comment-13217445 ]

Peter Schuller commented on CASSANDRA-3722:

I'm -0 on the original bit of this ticket, but +1 on more generic changes that cover the original use case as well if not better anyway.

I think that instead of trying to predict exactly the behavior of some particular event like compaction, we should just be better at actually responding to what is actually going on:

* We have CASSANDRA-2540 which can help avoid blocking uselessly on a dropped or slow request even if we haven't had the opportunity to react to overall behavior yet (I have a partial patch that breaks read repair, I haven't had time to finish it).
* Taking into account the number of outstanding requests is IMO a necessity. There is plenty of precedent for anyone who wants that (least used connections policies in various LB:s), but more importantly it would so clearly help in several situations, including:
** Sudden GC pause of a node
** Sudden death of a node
** Sudden page cache eviction and slowness of a node, before snitching figures it out
** Constantly overloaded node; even with the dynsnitch it would improve the situation as the number of requests affected by a dynsnitch reset is lessened
** Packet loss/hiccup/whatever across DC:s

There is some potential for foot shooting in the sense that if a node is broken in a way that it responds with incorrect data, but responds faster than anyone else, it will tend to "swallow" all the traffic. But honestly, that feels like a minor concern to me based on what I've seen actually happen in production clusters. If we ever start sending non-successes back over inter-node RPC, this would change however.

My only major concern is potential performance impacts of keeping track of the number of outstanding requests, but if that *does* become a problem one can make it probabilistic - have N % of all requests be tracked. Less impact, but also less immediate response to what's happening.

This will also have the side-effect of mitigating sudden bursts of promotion into old-gen if we combine it with pro-actively dropping read-repair messages for nodes that are overloaded (effectively prioritizing data reads), hence helping for CASSANDRA-3853.

{code}
Should we T (send additional requests which are not part of the normal operations) the requests until the other node recovers?
{code}

In the absence of read repair, we'd have to do speculative reads as Stu has previously noted. With read repair turned on, this is not an issue because the node will still receive requests and eventually warm up. Only with read repair turned off do we not send requests to more than the first N of endpoints, with N being what is required by CL.

Semi-relatedly, I think it would be a good idea to make the proximity sorting probabilistic in nature so that we don't do a binary flip back and forth between who gets data vs. digest reads or who doesn't get reads at all. That might mitigate this problem, but not help fundamentally since the rate of warm-up would decrease with a node being slow.

I do want to make this point though: *Every single production cluster* I have ever been involved with so far has been such that you basically never want to turn read repair off. Not because of read repair itself, but because of the traffic it generates. Having nodes not receive traffic is extremely dangerous under most circumstances as it leaves nodes cold, only to suddenly explode and cause timeouts and other bad behavior as soon as e.g. some neighbor goes down and it suddenly starts taking traffic. This is an easy way to make production clusters fall over.

If your workload is entirely in memory or otherwise not reliant on caching the problem is much less pronounced, but even then I would generally recommend that you keep it turned on if only because your nodes will have to be able to take the additional load *anyway* if you are to survive other nodes in the neighborhood going down. It just makes clusters much easier to reason about.
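The suggestion to track outstanding requests, optionally for only N % of requests, might be sketched like this. The class `OutstandingRequests` and its methods are hypothetical and use only JDK classes; the sampling approach (a per-request coin flip, with the caller remembering whether a request was tracked) is one possible reading of "probabilistic", not something the comment specifies.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-host counter of in-flight requests with optional sampling. */
final class OutstandingRequests {
    private final ConcurrentHashMap<String, AtomicInteger> pending = new ConcurrentHashMap<>();
    private final double sampleRate; // fraction of requests to track, in [0, 1]

    OutstandingRequests(double sampleRate) {
        this.sampleRate = sampleRate;
    }

    /** Returns true if this request is being tracked; pass the result to onComplete. */
    boolean onSend(String host) {
        // nextDouble() is in [0, 1), so a rate of 1.0 tracks everything and 0.0 nothing.
        if (ThreadLocalRandom.current().nextDouble() >= sampleRate) {
            return false;
        }
        pending.computeIfAbsent(host, h -> new AtomicInteger()).incrementAndGet();
        return true;
    }

    /** Call on response or timeout; only tracked requests decrement the counter. */
    void onComplete(String host, boolean tracked) {
        if (!tracked) {
            return;
        }
        AtomicInteger count = pending.get(host);
        if (count != null) {
            count.decrementAndGet();
        }
    }

    int pending(String host) {
        AtomicInteger count = pending.get(host);
        return count == null ? 0 : count.get();
    }
}
```

A lower sample rate reduces the bookkeeping cost on the request path, at the price of a noisier and slower signal, which matches the trade-off described in the comment.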
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210704#comment-13210704 ]

Vijay commented on CASSANDRA-3722:

Another issue just came up which is related to this ticket: a node has just completed bootstrap, so its file cache is not warm enough for a fast response. Hence the score for this node will be much lower, and very little traffic goes to that node. The problem is that unless more traffic is sent, the file caches will not warm up fast enough.

Should we T (send additional requests which are not part of the normal operations) the requests until the other node recovers?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188124#comment-13188124 ]

Vijay commented on CASSANDRA-3722:

How about, instead of sending a gossip message, in DES.updateScores we also compare the number of waiting tasks to be performed on a given node? This way we can actually move on to the next node when there is a lot more pending data to be received, before receiving the latencies... Makes sense?

(That is, when node y's pending count > (another node x's pending count + some %), then prefer node x.)
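The comparison rule in the parenthetical can be written out as a small predicate. This is a sketch under one assumption: that "+ %" means a relative margin on x's pending count. `PendingPreference` and `preferOver` are hypothetical names, not part of DES.

```java
/** Hypothetical sketch of the pending-count comparison; not DES code. */
final class PendingPreference {
    /**
     * Returns true when node x should be preferred over node y, that is when
     * y's pending count exceeds x's by more than the given relative margin.
     * A marginFraction of 0.2 means "more than 20% above x".
     */
    static boolean preferOver(int pendingX, int pendingY, double marginFraction) {
        return pendingY > pendingX * (1.0 + marginFraction);
    }
}
```

For example, with a 20% margin, 1000 pending on y against 100 on x prefers x, while 110 against 100 does not, so small fluctuations do not reorder the replicas.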
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185319#comment-13185319 ]

Brandon Williams commented on CASSANDRA-3722:

bq. Right, so I'm saying instead of gossiping an indirect indicator like compaction status, we could gossip read latency directly.

I'm not sure that's a good solution either due to the propagation time. Especially in the case that the repair is staggered across the replica set, there's a good chance you're penalizing the wrong host after the first one until the state is propagated to you.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185316#comment-13185316 ]

Vijay commented on CASSANDRA-3722:

Will do.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185217#comment-13185217 ]

Jonathan Ellis commented on CASSANDRA-3722:

bq. the problem with that is that until we send actual traffic to the suspected node we will not know it

Right, so I'm saying instead of gossiping an indirect indicator like compaction status, we could gossip read latency directly.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185138#comment-13185138 ]

Vijay commented on CASSANDRA-3722:

Probably after CASSANDRA-3723 we might be able to include the wait time in the queue, but the problem with that is that until we send actual traffic to the suspected node we will not know it. We should actually send occasional requests to those nodes to get this data (not sure how we can throttle this to fewer requests)... Something like test/real messages to see if they are back to normal SLAs (which does a complete end-to-end read from the disk), if we don't want to publish major events.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185131#comment-13185131 ] Brandon Williams commented on CASSANDRA-3722: - bq. you could base it on latency instead of pending count... I'm not sure I understand, do you mean the latency the pending reads would have if they came back at the time of calculating the scores? I was thinking about doing that since we have the timestamps in EM.
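The idea in the comment above, scoring a pending read by the latency it would have if it came back at score-calculation time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the dynamic snitch's actual code: the function name `pending_aware_latency`, its parameters, and the use of a plain mean are all hypothetical.

```python
def pending_aware_latency(completed_ms, pending_sent_at_ms, now_ms):
    """Mean latency in ms, counting each pending read as if it returned at now_ms.

    completed_ms: latencies of reads that already came back (hypothetical input).
    pending_sent_at_ms: send timestamps of reads still outstanding.
    """
    # A pending read has been waiting for (now - send time); use that as its latency.
    projected = [now_ms - sent for sent in pending_sent_at_ms]
    samples = list(completed_ms) + projected
    return sum(samples) / len(samples) if samples else 0.0


# A healthy host: only completed reads contribute.
healthy = pending_aware_latency([10, 12], [], now_ms=4000)

# A stalled host: one read sent at t=1000 ms is still outstanding at t=4000 ms,
# so its projected latency keeps growing as time passes.
stalled = pending_aware_latency([10, 12], [1000], now_ms=4000)
```

With this shape, a host that stops responding gets worse every time scores are recalculated, even though no new latency sample arrives from it.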
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185125#comment-13185125 ] Jonathan Ellis commented on CASSANDRA-3722: --- you could base it on latency instead of pending count...
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184576#comment-13184576 ] Vijay commented on CASSANDRA-3722: -- But the pending count will be almost zero for the bad nodes, as we have already redirected traffic, right?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184527#comment-13184527 ] Brandon Williams commented on CASSANDRA-3722: - The tricky thing is, we don't know how much to penalize them per pending read. We could choose an arbitrary static amount, and that would work as long as it's the same for all hosts... except the badness threshold loses meaning.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184516#comment-13184516 ] Jonathan Ellis commented on CASSANDRA-3722: --- bq. maybe penalizing hosts with pending reads would work better since it would work universally I like that idea.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184511#comment-13184511 ] Brandon Williams commented on CASSANDRA-3722: - Hmm, I see. Instead of communicating various states the node is in over gossip, maybe penalizing hosts with pending reads would work better since it would work universally.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184497#comment-13184497 ] Vijay commented on CASSANDRA-3722: -- Hi Brandon, I am talking about reads when RR is disabled (0.0). It is not going to be 1 or 2 reads (we would be really happy if it were only 1 to 2 reads); it will be as many reads as it takes for the first read to come back. For example, with a workload of 1000 requests per second, when the node goes slower a read might take a second or time out (10 seconds), which means 1000 to 10k requests.
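The arithmetic in the comment above can be written out as a small calculation. This is only a worked restatement of the numbers quoted there (1000 requests per second, a 1 second slow read, a 10 second timeout); the function name is hypothetical, and it assumes every request in the window is routed to the slow node.

```python
def requests_sent_before_feedback(requests_per_second, seconds_until_first_response):
    """Requests routed to a node before its first (slow) response can affect its score."""
    return requests_per_second * seconds_until_first_response


# Slow read that takes one second to come back.
slow_case = requests_sent_before_feedback(1000, 1)

# Read that only fails at the 10 second timeout.
timeout_case = requests_sent_before_feedback(1000, 10)
```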
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184475#comment-13184475 ] Brandon Williams commented on CASSANDRA-3722: - bq. If we have the end time we can avoid sending traffic back to the node until it recovers completely right? I'm not sure if you mean writes or checksum requests for read repair here, but neither is avoidable (except for RR by adjusting the probability setting.) If you mean the one or two reads after the reset interval, that doesn't seem like it's going to have a big enough impact to optimize for.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184427#comment-13184427 ] Vijay commented on CASSANDRA-3722: -- If we have the end time we can avoid sending traffic back to the node until it recovers completely, right? If a compaction lasts only a second, the worst case is that we will not send traffic until the ring delay has passed... Alternatively, we could take the difference from the average response time and throttle the requests... or even tee the request just to confirm that the event is complete. Makes sense?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184289#comment-13184289 ] Brandon Williams commented on CASSANDRA-3722: -
bq. Hi Brandon, Will it make sense for the remote node to avoid traffic for a given the start and end?
Sure, but my point is that's going to happen already:
* node X begins a repair, adversely affecting its read latency
* node Y tries to read data from X, and scores it badly due to the extra latency
At this point, node Y will not try to read from X again until either:
* all other members of X's replica set perform *worse* than X
* the RESET_INTERVAL_IN_MS elapses, and the entire process starts over again
Eventually, node X finishes the validation compaction and when the reset interval elapses, its reads are more equal to the rest of the replica set and everything is back to normal. Telling the dsnitch when a remote node starts and stops compacting doesn't seem like it's going to improve on this a whole lot.
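The behaviour described in the comment above (a slow replica is scored badly, avoided until the rest of its replica set performs worse, and given a fresh start when the reset interval elapses) can be modelled roughly as below. This is a toy sketch, not the dynamic snitch's code: the class `ToySnitch`, its methods, and the use of a plain mean latency as the score are assumptions made for illustration. Only the sort order of the scores matters here, not their absolute values.

```python
class ToySnitch:
    """Toy latency-based replica ranking with a periodic score reset."""

    def __init__(self, reset_interval_ms):
        self.reset_interval_ms = reset_interval_ms
        self.samples = {}        # host -> observed read latencies in ms
        self.last_reset_ms = 0

    def record(self, host, latency_ms):
        self.samples.setdefault(host, []).append(latency_ms)

    def score(self, host):
        # Lower is better; a host with no samples is treated as unpenalized.
        latencies = self.samples.get(host)
        return sum(latencies) / len(latencies) if latencies else 0.0

    def sorted_replicas(self, replicas, now_ms):
        # Once the reset interval has elapsed, forget all scores and start over.
        if now_ms - self.last_reset_ms >= self.reset_interval_ms:
            self.samples.clear()
            self.last_reset_ms = now_ms
        # sorted() is stable, so hosts with equal scores keep their given order.
        return sorted(replicas, key=self.score)


snitch = ToySnitch(reset_interval_ms=600_000)
replicas = ["X", "W", "Z"]

# Node X is repairing: Y observes one slow read from X and fast reads elsewhere.
snitch.record("X", 500)
snitch.record("W", 5)
snitch.record("Z", 7)
during_repair = snitch.sorted_replicas(replicas, now_ms=1_000)

# After the reset interval, every host starts from a clean slate again.
after_reset = snitch.sorted_replicas(replicas, now_ms=600_000)
```

In this model X drops to the back of the ordering after a single slow read, which is the point made in the comment: the snitch already reacts without being told that a compaction or repair started.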
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183761#comment-13183761 ] Vijay commented on CASSANDRA-3722: -- Hi Brandon, will it make sense for the remote node to avoid traffic for a given start and end? Yes, the start of it is going to be an issue, but at the end of the major event we can at least be conservative about not sending data?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183703#comment-13183703 ] Brandon Williams commented on CASSANDRA-3722: - It might be worth experimenting with the update interval and the window size, though.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183692#comment-13183692 ] Brandon Williams commented on CASSANDRA-3722: - The dsnitch doesn't have a phi to adjust. What would be phi in the FD is just the score here, and we don't care what that is; just that we sort by it.
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183683#comment-13183683 ] Jonathan Ellis commented on CASSANDRA-3722: --- Can we get a free lunch by adjusting the dsnitch's phi?
[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
[ https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183668#comment-13183668 ] Brandon Williams commented on CASSANDRA-3722: - The only way we could do this efficiently is via gossip, which will have at best a latency of one second to another (up to) 3 nodes, but asymptotically just one. This is going to easily take far longer to propagate through the cluster than it will for the snitch to see the slow read that affects the score enough to lower the remote node's read priority.
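To get a feel for the propagation delay mentioned above, the sketch below counts gossip rounds under an optimistic model: each round takes one second (as stated in the comment), and every node that already knows the state informs `fanout` nodes that did not know it yet. The function name and the model are assumptions for illustration; real gossip picks targets randomly and often contacts nodes that are already informed, so actual propagation is slower than this lower bound.

```python
def best_case_gossip_rounds(cluster_size, fanout):
    """Minimum rounds until all nodes are informed, with no wasted messages."""
    informed = 1   # the node whose state changed
    rounds = 0
    while informed < cluster_size:
        # Every informed node reaches `fanout` previously uninformed nodes.
        informed += informed * fanout
        rounds += 1
    return rounds


# 100-node cluster, one-second rounds.
rounds_with_three_targets = best_case_gossip_rounds(100, fanout=3)
rounds_with_one_target = best_case_gossip_rounds(100, fanout=1)
```

Even in this best case the state change needs several seconds to reach a 100-node cluster, while a coordinator notices a slow read as soon as that read returns.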