[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-30 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242828#comment-13242828
 ] 

Vijay commented on CASSANDRA-3722:
--

WFM, +1 on both v5 and the nodetool proxyhistograms patch.

> Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.
> --
>
> Key: CASSANDRA-3722
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3722
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
> Attachments: 0001-CASSANDRA-3722-A1-V2.patch, 
> 0001-CASSANDRA-3722-A1.patch, 0001-CASSANDRA-3722-v3.patch, 
> 0001-CASSANDRA-3723-A2-Patch.patch, 
> 0001-Expose-SP-latencies-in-nodetool-proxyhistograms.txt, 3722-v4.txt, 
> 3722-v5.txt
>
>
> Currently Dynamic snitch looks at the latency for figuring out which node 
> will be better serving the requests, this works great but there is a part of 
> the traffic sent to collect this data... There is also a window when Snitch 
> doesn't know about some major event which are going to happen on the node 
> (Node which is going to receive the data request).
> It would be great if we can send some sort hints to the Snitch so they can 
> score based on known events causing higher latencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-30 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242802#comment-13242802
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

bq. Is it because the lack of a datapoint isn't taken into account as slowness?

Exactly.  It's not receiving new data, so the score doesn't change and the dead 
host is still rated the best until the FD removes it as an option.  Doing it 
this way, time itself penalizes the host when it stops responding.
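
The effect described above can be sketched roughly as follows. This is only an
illustration, not the code from the patch: the class name, its methods, and the
choice of scoring a host as max(mean latency, time since last reply) are all
assumptions made for the example.

```python
class HostScore:
    """Hypothetical latency tracker; not Cassandra's actual implementation."""

    def __init__(self, now):
        self.latencies = []        # recent response latencies, in seconds
        self.last_reply_at = now   # timestamp of the most recent response

    def record(self, latency, now):
        self.latencies.append(latency)
        self.last_reply_at = now

    def score(self, now):
        """Lower is better. Silence counts as latency, so a host that stops
        answering gets a worse score as time passes."""
        mean = sum(self.latencies) / len(self.latencies) if self.latencies else 0.0
        silence = now - self.last_reply_at
        return max(mean, silence)


fast = HostScore(now=0.0)
fast.record(0.010, now=1.0)
score_while_alive = fast.score(now=1.0)   # driven by the single 10 ms sample
score_after_death = fast.score(now=6.0)   # no replies for 5 s, so silence dominates
```

With a plain average of samples, the second score would stay at 10 ms until the
failure detector stepped in; counting silence lets the score degrade on its own.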





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-30 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242800#comment-13242800
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

Why doesn't dsnitch incorporate this automatically?  Is it because the lack of 
a datapoint isn't taken into account as slowness?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-30 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242729#comment-13242729
 ] 

Vijay commented on CASSANDRA-3722:
--

Did some testing (single DC) and it works as expected. +1 for v4. Thanks!





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-28 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240524#comment-13240524
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

A few things while I continue to test:

Compaction can occur before the gossiper is started (and indeed, while gossip 
is deactivated):

{noformat}

ERROR 23:38:09,192 Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.AssertionError
    at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1105)
    at org.apache.cassandra.service.StorageService.reportSeverity(StorageService.java:790)
    at org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutor.beginCompaction(CompactionManager.java:1021)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:136)
    at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:128)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

getConnectionManagers is unused, but I imagine that was to help with the 
hashcode business that we can now remove.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-23 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237102#comment-13237102
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

We should probably avoid having Gossiper inject application states directly, if 
for no other reason than to not make life harder for CASSANDRA-3125.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-03-22 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236306#comment-13236306
 ] 

Vijay commented on CASSANDRA-3722:
--

I had almost finished the patch, but filtering based on the pending queue is 
potentially dangerous.
On a multi-region cluster, the pending commands for the local replicas will 
almost always be higher than for remote ones, because the remote replicas don't 
receive reads. We might not want to filter based on pending.

We could do a hacky solution by padding the scores of the remote DCs to an 
artificially higher value than the local DC's. What do you guys think?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-02-29 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219639#comment-13219639
 ] 

Pavel Yaskevich commented on CASSANDRA-3722:


Also, maintaining a kind of normalized load statistic for each node, in 
combination with pending requests, could give a better view of what is going on 
on the node. For example, if load is <= 0.5 but we have a big pending queue 
for/on the node, that could mean network failure. For the statistic, we can 
assign each sub-routine a "load impact" factor (e.g. compaction 0.3, scrub 0.2, 
read 0.01), set a load threshold for "overloaded" nodes (e.g. 0.85, which could 
be adjusted at runtime), and sort hosts according to their load + pending 
request statistics, which would make penalizing hosts more precise. 
Obviously some normalization should be done because clusters won't always have 
nodes with identical processing capabilities (network, hardware etc.).
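
A rough sketch of that idea, using the example factors and threshold from the 
comment above. Everything else is an assumption made for illustration: the 
function names, the per-node capacity divisor used for normalization, and the 
choice to sort by (overloaded, load, pending) rather than by some combined 
value.

```python
# Hypothetical "load impact" factors, taken from the examples in the comment.
LOAD_IMPACT = {"compaction": 0.3, "scrub": 0.2, "read": 0.01}
OVERLOAD_THRESHOLD = 0.85  # could be made adjustable at runtime


def node_load(active_tasks, capacity=1.0):
    """Sum the impact of every running task, normalized by a per-node
    capacity factor (an assumption standing in for hardware differences)."""
    raw = sum(LOAD_IMPACT[kind] * count for kind, count in active_tasks.items())
    return raw / capacity


def rank_hosts(hosts):
    """Sort hosts best-first: non-overloaded before overloaded, then by load,
    then by number of pending requests."""
    def key(item):
        name, (tasks, pending, capacity) = item
        load = node_load(tasks, capacity)
        return (load >= OVERLOAD_THRESHOLD, load, pending)
    return [name for name, _ in sorted(hosts.items(), key=key)]


hosts = {
    # name: (active tasks, pending requests, capacity)
    "a": ({"compaction": 2, "read": 30}, 5, 1.0),   # load ~0.9, overloaded
    "b": ({"read": 20}, 40, 1.0),                   # load ~0.2, long queue
    "c": ({"scrub": 1, "read": 10}, 3, 1.0),        # load ~0.3
}
ranking = rank_hosts(hosts)
```

Host "b" has low load but a long pending queue, which is the combination the 
comment flags as a possible network problem; a real implementation would 
probably want to treat that case explicitly rather than only as a tie-breaker.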





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-02-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217657#comment-13217657
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

bq. Taking into account the number of outstanding requests is IMO a necessity

It sounds like you're saying that if coordinator X has 1000 requests pending 
response from replica Y, and only 100 to replica Z, then X should suspect that 
Y is having trouble and rely more heavily on Z, even before requests to Y start 
timing out.  Right?

That sounds reasonable to me in theory.  How do we mix that into the existing 
latency information we track for dsnitch?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-02-27 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217445#comment-13217445
 ] 

Peter Schuller commented on CASSANDRA-3722:
---

I'm -0 on the original bit of this ticket, but +1 on more generic changes that 
cover the original use case as well, if not better, anyway. I think that instead 
of trying to predict exactly the behavior of some particular event like 
compaction, we should just be better at responding to what is actually 
going on:

* We have CASSANDRA-2540 which can help avoid blocking uselessly on a dropped 
or slow request even if we haven't had the opportunity to react to overall 
behavior yet (I have a partial patch that breaks read repair, I haven't had 
time to finish it).
* Taking into account the number of outstanding requests is IMO a necessity. 
There is plenty of precedent for anyone who wants that (least-used-connection 
policies in various load balancers), but more importantly it would so clearly 
help in several situations, including:
** Sudden GC pause of a node
** Sudden death of a node
** Sudden page cache eviction and slowness of a node, before snitching figures 
it out
** Constantly overloaded node; even with the dynsnitch it would improve the 
situation as the number of requests affected by a dynsnitch reset is lessened
** Packet loss/hiccup/whatever across DCs

There is some potential for foot shooting in the sense that if a node is broken 
in a way that it responds with incorrect data, but responds faster than anyone 
else, it will tend to "swallow" all the traffic. But honestly, that feels like 
a minor concern to me based on what I've seen actually happen in production 
clusters. If we ever start sending non-successes back over inter-node RPC, this 
would change however.

My only major concern is potential performance impacts of keeping track of the 
number of outstanding requests, but if that *does* become a problem one can 
make it probabilistic - have N % of all requests be tracked. Less impact, but 
also less immediate response to what's happening.
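
A minimal sketch of the probabilistic variant, assuming a hypothetical 
per-replica counter; none of these names exist in Cassandra, and sample_rate 
must be greater than zero for the estimate to be defined.

```python
import random


class SampledOutstanding:
    """Hypothetical per-replica counter that tracks only a fraction of requests."""

    def __init__(self, sample_rate, rng=None):
        self.sample_rate = sample_rate      # e.g. 0.1 tracks roughly 10 %
        self.rng = rng or random.Random()
        self.outstanding = {}               # replica -> tracked in-flight count

    def on_send(self, replica):
        """Return True if this request is tracked; pass the flag to on_reply."""
        if self.rng.random() >= self.sample_rate:
            return False
        self.outstanding[replica] = self.outstanding.get(replica, 0) + 1
        return True

    def on_reply(self, replica, tracked):
        if tracked:
            self.outstanding[replica] -= 1

    def estimate(self, replica):
        """Scale the sampled count back up to an estimate of all in-flight requests."""
        return self.outstanding.get(replica, 0) / self.sample_rate


tracker = SampledOutstanding(sample_rate=1.0)   # 1.0 tracks every request
flags = [tracker.on_send("y") for _ in range(3)]
tracker.on_reply("y", flags[0])
```

Lowering sample_rate reduces the bookkeeping per request, at the cost of a 
noisier estimate that reacts more slowly, which is the trade-off noted above.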

This will also have the side-effect of mitigating sudden bursts of promotion 
into old-gen if we combine it with pro-actively dropping read-repair messages 
for nodes that are overloaded (effectively prioritizing data reads), hence 
helping for CASSANDRA-3853.

{code}
Should we T (send additional requests which are not part of the normal 
operations) the requests until the other node recovers?
{code}

In the absence of read repair, we'd have to do speculative reads as Stu has 
previously noted. With read repair turned on, this is not an issue because the 
node will still receive requests and eventually warm up. Only with read repair 
turned off do we not send requests to more than the first N of endpoints, with 
N being what is required by CL.

Semi-relatedly, I think it would be a good idea to make the proximity sorting 
probabilistic in nature so that we don't do a binary flip back and forth 
between who gets data vs. digest reads or who doesn't get reads at all. That 
might mitigate this problem, but not help fundamentally since the rate of 
warm-up would decrease with a node being slow.
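
One way to picture probabilistic sorting is a weighted shuffle. The sketch 
below is an assumption for illustration, not a proposal for the actual snitch: 
it treats lower scores as better, requires scores to be positive, and weights 
each endpoint by 1 / score.

```python
import random


def probabilistic_order(scores, rng):
    """Return endpoints in a random order biased toward low (good) scores.

    Each position is drawn without replacement with weight 1 / score, so a
    slightly slower replica still comes first some of the time."""
    remaining = dict(scores)
    order = []
    while remaining:
        names = list(remaining)
        weights = [1.0 / remaining[n] for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order


rng = random.Random(1)
example = probabilistic_order({"a": 1.0, "b": 4.0, "c": 2.0}, rng)
```

With scores of 1.0 and 4.0, the better endpoint is placed first about 80 % of 
the time instead of always, so the slower one keeps receiving some reads.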

I do want to make this point though: *Every single production cluster* I have 
ever been involved with so far, has been such that you basically never want to 
turn read repair off. Not because of read repair itself, but because of the 
traffic it generates. Having nodes not receive traffic is extremely dangerous 
under most circumstances as it leaves nodes cold, only to suddenly explode and 
cause timeouts and other bad behavior as soon as e.g. some neighbor goes down 
and it suddenly starts taking traffic. This is an easy way to make production 
clusters fall over. If your workload is entirely in memory or otherwise not 
reliant on caching the problem is much less pronounced, but even then I would 
generally recommend that you keep it turned on if only because your nodes will 
have to be able to take the additional load *anyway* if you are to survive 
other nodes in the neighborhood going down. It just makes clusters much 
easier to reason about.


[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-02-17 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210704#comment-13210704
 ] 

Vijay commented on CASSANDRA-3722:
--

Another issue related to this ticket just came up.
A node that has just completed bootstrap will not have its file cache warm 
enough to respond quickly. Its scores will therefore be much worse, and very 
little traffic will go to that node. The problem is that unless more traffic is 
sent, the file caches will not warm up fast enough. 
Should we T (send additional requests which are not part of the 
normal operations) the requests until the other node recovers?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-17 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188124#comment-13188124
 ] 

Vijay commented on CASSANDRA-3722:
--

How about, instead of sending a gossip message, we also compare the number of 
waiting tasks to be performed on a given node in DES.updateScores? This way we 
can move on to the next node when there is a lot more pending data to be 
received, before receiving the latencies... Makes sense? (That is, if node y's 
pending > (another node x's pending + some %), then prefer node x.)
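
The comparison in parentheses could look roughly like this. It is a sketch 
only: the function name, the dictionary of pending counts, and the 20 % margin 
are made up for the example and do not come from any patch on this ticket.

```python
def prefer_by_pending(x, y, pending, margin=0.2):
    """Return the node to prefer between x and y based on pending tasks alone.

    y is skipped in favour of x only when y's backlog exceeds x's by more
    than `margin` (a fraction; 0.2 is an arbitrary illustrative value).
    Returns None when the backlogs are too close to call, leaving the
    decision to the latency scores."""
    if pending[y] > pending[x] * (1 + margin):
        return x
    if pending[x] > pending[y] * (1 + margin):
        return y
    return None


clear_case = prefer_by_pending("x", "y", {"x": 100, "y": 1000})
close_case = prefer_by_pending("x", "y", {"x": 100, "y": 110})
```

The margin keeps small differences in backlog from overriding the measured 
latencies.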





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185319#comment-13185319
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

bq. Right, so I'm saying instead of gossiping an indirect indicator like 
compaction status, we could gossip read latency directly.

I'm not sure that's a good solution either due to the propagation time.  
Especially in the case that the repair is staggered across the replica set, 
there's a good chance you're penalizing the wrong host after the first one 
until the state is propagated to you.






[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185316#comment-13185316
 ] 

Vijay commented on CASSANDRA-3722:
--

will do.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185217#comment-13185217
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

bq. the problem with that is that until we send actual traffic to the suspected 
node we will not know it

Right, so I'm saying instead of gossiping an indirect indicator like compaction 
status, we could gossip read latency directly.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185138#comment-13185138
 ] 

Vijay commented on CASSANDRA-3722:
--

After CASSANDRA-3723 we might be able to include the wait time in the queue, 
but the problem with that is that until we send actual traffic to the suspected 
node we will not know it. We would actually have to send occasional requests to 
those nodes to get this data (not sure how we can throttle that down to fewer 
requests)... 

Something like test/real messages to see whether they are back to their normal 
SLAs (doing a complete end-to-end read from disk), if we don't want to publish 
major events.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185131#comment-13185131
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

bq. you could base it on latency instead of pending count...

I'm not sure I understand, do you mean the latency the pending reads would have 
if they came back at the time of calculating the scores?  I was thinking about 
doing that since we have the timestamps in EM.
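A minimal sketch of that idea, assuming we keep the send timestamp of every 
outstanding read: at scoring time, each pending read is counted as if its reply 
arrived right now. The class and method names below are hypothetical and are not 
taken from the Cassandra code base; averaging completed and provisional samples 
together is just one possible way to combine them.

```java
// Hypothetical sketch; not Cassandra's DynamicEndpointSnitch.
final class PendingLatencySketch {

    // Mean latency over completed reads plus pending reads, where each pending
    // read is charged the time elapsed since it was sent (nowMillis - sentAt).
    static double scoreWithPending(long[] completedMillis,
                                   long[] pendingSentAtMillis,
                                   long nowMillis) {
        long total = 0;
        for (long latency : completedMillis) {
            total += latency;
        }
        for (long sentAt : pendingSentAtMillis) {
            total += nowMillis - sentAt;
        }
        int samples = completedMillis.length + pendingSentAtMillis.length;
        return samples == 0 ? 0.0 : (double) total / samples;
    }

    public static void main(String[] args) {
        long[] completed = {5, 7};           // two fast reads that already returned
        long[] pendingSentAt = {1000, 1400}; // two reads still outstanding
        // At t=2000 the pending reads have waited 1000 ms and 600 ms.
        System.out.println(scoreWithPending(completed, pendingSentAt, 2000));
    }
}
```

With this accounting a node that has stopped answering gets a worse score even 
though none of its slow reads has completed yet.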





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-12 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185125#comment-13185125
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

you could base it on latency instead of pending count...





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184576#comment-13184576
 ] 

Vijay commented on CASSANDRA-3722:
--

But the pending count will be almost zero for the bad nodes, as we have already 
redirected traffic, right?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184527#comment-13184527
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

The tricky thing is, we don't know how much to penalize them per pending read.  
We could choose an arbitrary static amount, and that would work as long as it's 
the same for all hosts... except the badness threshold loses meaning.
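As a rough illustration of the static-penalty idea (not the actual dsnitch 
code), the sketch below adds a fixed amount per pending read to a host's 
latency. The 10 ms constant and all names are hypothetical. Because the same 
constant applies to every host the relative ordering stays consistent, but the 
resulting score is no longer a pure latency, which is why a threshold defined 
relative to latency becomes harder to interpret.

```java
// Hypothetical sketch; the penalty value is arbitrary and chosen for illustration.
final class PendingReadPenaltySketch {

    static final double PENALTY_MS_PER_PENDING_READ = 10.0;

    // Score = observed latency plus a fixed penalty for every pending read.
    static double penalizedScore(double latencyMs, int pendingReads) {
        return latencyMs + PENALTY_MS_PER_PENDING_READ * pendingReads;
    }

    public static void main(String[] args) {
        double idleHost = penalizedScore(5.0, 0);   // no backlog
        double backedUp = penalizedScore(4.0, 3);   // faster on paper, but 3 reads pending
        System.out.println(idleHost);
        System.out.println(backedUp);
        System.out.println(idleHost < backedUp ? "prefer idle host" : "prefer backed-up host");
    }
}
```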





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184516#comment-13184516
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

bq. maybe penalizing hosts with pending reads would work better since it would 
work universally

I like that idea.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184511#comment-13184511
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

Hmm, I see.  Instead of communicating various states the node is in over 
gossip, maybe penalizing hosts with pending reads would work better since it 
would work universally.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184497#comment-13184497
 ] 

Vijay commented on CASSANDRA-3722:
--

Hi Brandon, I am talking about reads when RR is disabled (0.0). It is not going 
to be 1 or 2 reads (we would be really happy if it were only 1 or 2 reads); it 
will be as many reads as it takes for the first read to come back. For example, 
with a workload of 1000 requests per second, when the node gets slower a read 
might take a second or time out (10 seconds), which means 1000 to 10k requests.
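The arithmetic behind those numbers can be written out as a small sketch. It 
assumes, as the example above does, that the whole 1000 requests/second stream 
keeps going to the slow node until its first response (or timeout) comes back; 
the class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope helper, not part of Cassandra.
final class InFlightEstimate {

    // Requests sent to a node before the first slow response (or timeout) returns.
    static long requestsBeforeFirstResponse(long requestsPerSecond, long delaySeconds) {
        return requestsPerSecond * delaySeconds;
    }

    public static void main(String[] args) {
        // First read comes back after 1 second.
        System.out.println(requestsBeforeFirstResponse(1000, 1));
        // First read hits the 10 second timeout.
        System.out.println(requestsBeforeFirstResponse(1000, 10));
    }
}
```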





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184475#comment-13184475
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

bq. If we have the end time we can avoid sending traffic back to the node until 
it recovers completely right?

I'm not sure if you mean writes or checksum requests for read repair here, but 
neither is avoidable (except for RR by adjusting the probability setting.)

If you mean the one or two reads after the reset interval, that doesn't seem 
like it's going to have a big enough impact to optimize for.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184427#comment-13184427
 ] 

Vijay commented on CASSANDRA-3722:
--




If we have the end time we can avoid sending traffic back to the node until 
it recovers completely, right? If a compaction runs for only a second, the worst 
case is that we will not send traffic until ring delay... Alternatively, we 
could take the difference from the average response time and throttle the 
requests... or even tee the request just to confirm that the event is complete. 
Makes sense? 






[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-11 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184289#comment-13184289
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

bq. Hi Brandon, will it make sense for the remote node to avoid traffic for a 
given start and end?

Sure, but my point is that's going to happen already:

* node X begins a repair, adversely affecting its read latency
* node Y tries to read data from X, and scores it badly due to the extra latency

At this point, node Y will not try to read from X again until either:

* all other members of X's replica set perform *worse* than X
* the RESET_INTERVAL_IN_MS elapses, and the entire process starts over again

Eventually, node X finishes the validation compaction and when the reset 
interval elapses, its reads are more equal to the rest of the replica set and 
everything is back to normal.

Telling the dsnitch when a remote node starts and stops compacting doesn't seem 
like it's going to improve on this a whole lot.
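The sequence described above can be sketched in a few lines. This is a 
simplified illustration, not the real dynamic snitch: the class name, the 
equal-weight blending of samples, and the treatment of unscored nodes as 0 are 
all assumptions made for the example. The only property it relies on is the one 
discussed in this thread, that replicas are sorted by score and that scores are 
thrown away when the reset interval elapses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not Cassandra's DynamicEndpointSnitch.
final class LatencyScoreSketch {

    private final Map<String, Double> scores = new HashMap<>();

    // Blend a new latency sample with the previous score (equal weights).
    void recordLatency(String node, double millis) {
        scores.merge(node, millis, (old, latest) -> (old + latest) / 2.0);
    }

    // Replicas ordered best (lowest score) first; unscored nodes count as 0.
    // The sort is stable, so ties keep the caller's original order.
    List<String> sortByScore(List<String> replicas) {
        List<String> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingDouble((String n) -> scores.getOrDefault(n, 0.0)));
        return sorted;
    }

    // Stand-in for the reset interval elapsing: forget every score.
    void reset() {
        scores.clear();
    }

    public static void main(String[] args) {
        LatencyScoreSketch snitchOnY = new LatencyScoreSketch();
        List<String> replicas = List.of("X", "Z");

        snitchOnY.recordLatency("X", 5.0);
        snitchOnY.recordLatency("Z", 6.0);
        System.out.println(snitchOnY.sortByScore(replicas)); // X is preferred

        snitchOnY.recordLatency("X", 200.0); // X starts a repair, one slow read
        System.out.println(snitchOnY.sortByScore(replicas)); // Z is now preferred

        snitchOnY.reset(); // reset interval elapses, X gets another chance
        System.out.println(snitchOnY.sortByScore(replicas));
    }
}
```

Note that the score is only ever used as a sort key here; its absolute value 
carries no meaning.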






[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-10 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183761#comment-13183761
 ] 

Vijay commented on CASSANDRA-3722:
--

Hi Brandon, will it make sense for the remote node to avoid traffic for a given 
start and end? Yes, the start is going to be an issue, but at the end of the 
major event we can at least be conservative about not sending data? 





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-10 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183703#comment-13183703
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

It might be worth experimenting with the update interval and the window size, 
though.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-10 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183692#comment-13183692
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

The dsnitch doesn't have a phi to adjust.  What would be phi in the FD is just 
the score here, and we don't care what that is; just that we sort by it.





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-10 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183683#comment-13183683
 ] 

Jonathan Ellis commented on CASSANDRA-3722:
---

Can we get a free lunch by adjusting the dsnitch's phi?





[jira] [Commented] (CASSANDRA-3722) Send Hints to Dynamic Snitch when Compaction or repair is going on for a node.

2012-01-10 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183668#comment-13183668
 ] 

Brandon Williams commented on CASSANDRA-3722:
-

The only way we could do this efficiently is via gossip which will have at best 
a latency of one second to another (up to) 3 nodes, but asymptotically just 
one.  This is going to easily take far longer to propagate through the cluster 
than it will for the snitch to see the slow read that affects the score enough 
to lower the remote node's read priority.
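To put a rough number on the propagation delay, here is an optimistic 
lower-bound sketch. It assumes one gossip round per second and that every node 
that already knows the state tells `fanout` nodes that do not yet know it in 
each round. Real gossip picks targets at random and will repeat itself, so 
actual propagation can only be slower. The class and method names are 
hypothetical.

```java
// Hypothetical best-case model of gossip propagation; not Cassandra's Gossiper.
final class GossipRoundsSketch {

    // Rounds (about one second each) until all nodes know a new state,
    // if every informed node reaches `fanout` uninformed nodes per round.
    static int roundsToReachAll(int clusterSize, int fanout) {
        if (clusterSize < 1 || fanout < 1) {
            throw new IllegalArgumentException("clusterSize and fanout must be >= 1");
        }
        long informed = 1; // the node whose state changed
        int rounds = 0;
        while (informed < clusterSize) {
            informed *= (1 + fanout);
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        // 100-node cluster, one new node reached per informed node per round.
        System.out.println(roundsToReachAll(100, 1));
        // Same cluster, best case of three nodes reached per round.
        System.out.println(roundsToReachAll(100, 3));
    }
}
```

Even in this best case the state takes several seconds to reach every node, 
whereas a coordinator notices a slow replica as soon as one slow read returns.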
