[ 
https://issues.apache.org/jira/browse/CASSANDRA-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608262#comment-14608262
 ] 

Mark Curtis commented on CASSANDRA-9509:
----------------------------------------

+1 to this idea: adaptively setting the stream throughput based on the type 
of operation being run on the node. If a node is being decommissioned, for 
example, you may want to raise the throughput so that the node streams out 
its data quickly. The setting would be lowered when the node is streaming 
data in, for example when bootstrapping. I think this would be very useful, 
especially to new users of Cassandra, since it is not always obvious why an 
operation is taking a long time. Having the stream throughput default to a 
rudimentary value based on the operation would be a good idea.
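
A rough sketch of what I mean, in Python: pick a throttle from the operation 
type and build the matching nodetool setstreamthroughput call. The operation 
names, the numbers and the helper names are hypothetical placeholders, not 
existing Cassandra settings, and the value is assumed to be in megabits per 
second.

```python
# Hypothetical per-operation throttles, in megabits per second.
# The numbers are placeholders for illustration, not recommendations.
THROUGHPUT_MBPS = {
    "decommission": 400,  # node streams its data out: raise the cap
    "bootstrap": 100,     # node streams data in: keep the cap lower
}

# Hypothetical fallback for operations that have no entry above.
FALLBACK_MBPS = 200


def throughput_for(operation):
    """Return the throttle (Mb/s) to use for the given operation name."""
    return THROUGHPUT_MBPS.get(operation, FALLBACK_MBPS)


def nodetool_command(operation):
    """Build (but do not run) the nodetool call that applies the throttle."""
    return ["nodetool", "setstreamthroughput", str(throughput_for(operation))]


if __name__ == "__main__":
    for op in ("decommission", "bootstrap", "repair"):
        print(op, "->", " ".join(nodetool_command(op)))
```

The command is only built here, not executed, so the mapping can be checked 
before anything touches a live node.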

> Streams throughput control
> --------------------------
>
>                 Key: CASSANDRA-9509
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9509
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Config
>            Reporter: Alain RODRIGUEZ
>            Priority: Minor
>
> Currently, I have to keep tuning stream throughput manually all the time 
> (through nodetool setstreamthroughput), since the same value applies to, 
> for example, both a decommission and a removenode. The point is that in the 
> first case the network traffic goes from 1 --> N nodes (and is obviously 
> limited by the sending node), while in the second it goes from N --> N nodes 
> (N being the number of remaining nodes). When removing a node, the 
> throughput limit will not be reached in most cases, and all the nodes will 
> be under heavy load. So with the same value of stream throughput, we send N 
> times faster on a removenode than on a decommission. 
> Another example is repair, which also gets faster as more nodes start 
> repairing (we have 20 nodes, taking 2+ days to repair data, and repair has 
> to run within 10 days, so it can't be one node at a time), and stream 
> throughput needs to be adjusted accordingly.
> Is there a way to:
> - limit incoming network traffic on a node?
> - limit the network traffic sent cluster-wide?
> - make streaming processes a background task (using the remaining 
> resources)? This looks harder to me, since the bottleneck depends on the 
> node hardware and the workload. It can be the CPU, the network, the disk 
> throughput or even the memory...
> If none of those ideas is doable, could we separate the stream throughputs 
> by operation, so that they can be configured individually in 
> cassandra.yaml?
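
To put numbers on the 1 --> N versus N --> N point in the description above, 
here is a small Python sketch. It assumes the limit applies per sending node, 
as the description says; the function names and the 200 Mb/s figure are 
hypothetical, and the 19 senders come from the 20-node cluster mentioned 
above with one node removed.

```python
def aggregate_outbound_mbps(per_node_limit, senders):
    """Total outbound streaming rate if every sender hits its own limit."""
    return per_node_limit * senders


def per_node_limit_mbps(target_aggregate, senders):
    """Per-node limit that keeps the total at or below target_aggregate.

    The result is floored at 1 because this sketch assumes that a value
    of 0 disables throttling instead of stopping the streams.
    """
    return max(1, target_aggregate // senders)


if __name__ == "__main__":
    limit = 200  # hypothetical per-node limit, in megabits per second

    # decommission: one node sends to the others
    print(aggregate_outbound_mbps(limit, 1))

    # removenode on a 20-node cluster: the 19 remaining nodes all send
    print(aggregate_outbound_mbps(limit, 19))

    # per-node limit that would bring removenode back to the same total
    print(per_node_limit_mbps(limit, 19))
```

With the same per-node value, the removenode case moves 19 times as much 
data per second as the decommission case, which is why the value has to be 
retuned by hand between operations today.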



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
