[jira] [Comment Edited] (CASSANDRA-14499) node-level disk quota

2018-06-09 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506971#comment-16506971
 ] 

Jeremiah Jordan edited comment on CASSANDRA-14499 at 6/9/18 1:03 PM:
-

If one node has reached “full”, how likely is it that others are about to as 
well? Without monitoring, how will an operator know to do something to “fix” 
the situation? I’m just not convinced that it’s worth adding the logic and 
complications in the rest of the code to allow this feature, which will maybe 
buy a short band-aid of time before things completely fall over, and possibly 
just make things fall over early. If you are lucky and one node has enough more 
data than the others that it hits the quota first, without the others following 
shortly behind, you might get a small amount of breathing room for compaction 
to clean a little space out, but that is only going to do so much; it won’t fix 
the problem. You need to recognize as an operator that your nodes are full and 
add more nodes to your cluster, or add more disk space to your cluster.


was (Author: jjordan):
If one node has reached “full” how likely is it that others are about to as 
well?  Without monitoring how will an operator know to do something to “fix” 
the situation? I’m just not convinced that it’s worth adding the logic and 
complications in the rest of the code to allow this feature, which will maybe 
add a short bandaid of time before things completely fall over, and possible 
just have things fall over early. If you are lucky and one node has enough more 
data than other that it hits this first, without other following shortly 
behind, you might give a small amount of breathing room for compaction to clean 
a little space out, but that is only going to do so much, it won’t fix the 
problem. You need to recognize as an operator that your nodes are full and add 
more nodes to your cluster, or add more disk space to your cluster.

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Jordan West
> Assignee: Jordan West
> Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  
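
The quota rules in the description above (percentage or absolute value, absolute wins, gossip toggled around the quota) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (DiskQuotaSketch, effectiveQuotaBytes, gossipShouldBeEnabled) are hypothetical and are not Cassandra APIs or proposed YAML option names, and treating "no quota configured" as "the whole disk is usable" is an assumption the ticket does not state.

```java
import java.util.OptionalDouble;
import java.util.OptionalLong;

public class DiskQuotaSketch {
    // Hypothetical helper: resolves the usable-space quota in bytes.
    // If both settings are present, the absolute value takes precedence,
    // as the description requires.
    static long effectiveQuotaBytes(long totalBytes,
                                    OptionalDouble quotaPercent,
                                    OptionalLong quotaAbsoluteBytes) {
        if (quotaAbsoluteBytes.isPresent()) {
            return quotaAbsoluteBytes.getAsLong();
        }
        if (quotaPercent.isPresent()) {
            return (long) (totalBytes * quotaPercent.getAsDouble() / 100.0);
        }
        // Assumption: with no quota configured, the whole disk is usable.
        return totalBytes;
    }

    // Gossip stays enabled only while stored data is below the quota;
    // reaching the quota disables it, dropping below re-enables it.
    static boolean gossipShouldBeEnabled(long storedBytes, long quotaBytes) {
        return storedBytes < quotaBytes;
    }

    public static void main(String[] args) {
        long total = 1000L;
        long quota = effectiveQuotaBytes(total, OptionalDouble.of(80.0), OptionalLong.empty());
        System.out.println("quota from percentage: " + quota);
        System.out.println("gossip at 799 bytes: " + gossipShouldBeEnabled(799L, quota));
        System.out.println("gossip at 800 bytes: " + gossipShouldBeEnabled(800L, quota));
    }
}
```

Note that this check is purely local to one node, which is the concern raised in the comment: it says nothing about whether the remaining nodes are close to their own quotas.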



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-09 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506971#comment-16506971
 ] 

Jeremiah Jordan commented on CASSANDRA-14499:
-

If one node has reached “full”, how likely is it that others are about to as 
well? Without monitoring, how will an operator know to do something to “fix” 
the situation? I’m just not convinced that it’s worth adding the logic and 
complications in the rest of the code to allow this feature, which will maybe 
buy a short band-aid of time before things completely fall over, and possibly 
just make things fall over early. If you are lucky and one node has enough more 
data than the others that it hits the quota first, without the others following 
shortly behind, you might get a small amount of breathing room for compaction 
to clean a little space out, but that is only going to do so much; it won’t fix 
the problem. You need to recognize as an operator that your nodes are full and 
add more nodes to your cluster, or add more disk space to your cluster.




[jira] [Commented] (CASSANDRA-14507) OutboundMessagingConnection backlog is not fully written in case of race conditions

2018-06-09 Thread Sergio Bossa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506886#comment-16506886
 ] 

Sergio Bossa commented on CASSANDRA-14507:
--

bq.  Wouldn't they be picked up by the 
MessageOutHandler::channelWritabilityChanged and then get drained?

That is assuming the writability changes before the timeout window, which might 
very well not be the case, unless I'm misunderstanding your question? Also, 
{{channelWritabilityChanged()}} is race-prone by itself as mentioned above.

> OutboundMessagingConnection backlog is not fully written in case of race 
> conditions
> ---
>
> Key: CASSANDRA-14507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14507
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Sergio Bossa
> Priority: Major
>
> The {{OutboundMessagingConnection}} writes into a backlog queue before the 
> connection handshake is successfully completed, and then writes that backlog 
> to the channel as soon as the successful handshake moves the channel state to 
> {{READY}}.
> This is unfortunately race-prone, as the following could happen:
> 1) One or more writer threads see the channel state as {{NOT_READY}} in 
> {{#sendMessage()}} and are about to enqueue to the backlog, but they get 
> descheduled by the OS.
> 2) The handshake thread is scheduled by the OS and moves the channel state to 
> {{READY}}, emptying the backlog.
> 3) The writer threads are scheduled back and add to the backlog, but the 
> channel state is {{READY}} at this point, so those writes sit in the 
> backlog and expire.
> Please note that a similar race condition exists between 
> {{OutboundMessagingConnection#sendMessage()}} and 
> {{MessageOutHandler#channelWritabilityChanged()}}. It is far more serious, 
> as the channel writability could change frequently. Luckily, it looks like 
> {{ChannelWriter#write()}} never gets invoked with {{checkWritability}} set to 
> {{true}} (so writes never go to the backlog when the channel is not writable).
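
The three-step interleaving in the description above can be replayed deterministically in a small, single-threaded sketch. Everything here is a stand-in written for illustration: BacklogRaceSketch, handshakeComplete, drain and replay are hypothetical names and not the actual OutboundMessagingConnection code. The "re-check the state after enqueueing" variant is just one possible mitigation, shown as an assumption; it is not necessarily the fix adopted for this ticket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

public class BacklogRaceSketch {
    enum State { NOT_READY, READY }

    final AtomicReference<State> state = new AtomicReference<>(State.NOT_READY);
    final Queue<String> backlog = new ConcurrentLinkedQueue<>();
    // Stands in for the real channel; only touched from one thread in this replay.
    final List<String> channel = new ArrayList<>();

    // What the handshake thread does: flip the state, then write out the backlog.
    void handshakeComplete() {
        state.set(State.READY);
        drain();
    }

    // Moves everything currently in the backlog to the channel.
    void drain() {
        String message;
        while ((message = backlog.poll()) != null) {
            channel.add(message);
        }
    }

    // Replays steps 1-3 in order on a single thread. With recheckAfterEnqueue
    // set, the writer looks at the state again after enqueueing and drains
    // the backlog itself if the handshake has already completed.
    static BacklogRaceSketch replay(boolean recheckAfterEnqueue) {
        BacklogRaceSketch conn = new BacklogRaceSketch();
        // Step 1: the writer observes NOT_READY, then is "descheduled".
        boolean sawNotReady = conn.state.get() == State.NOT_READY;
        // Step 2: the handshake completes and empties the (still empty) backlog.
        conn.handshakeComplete();
        // Step 3: the writer resumes and enqueues based on its stale observation.
        if (sawNotReady) {
            conn.backlog.add("msg");
            if (recheckAfterEnqueue && conn.state.get() == State.READY) {
                conn.drain();
            }
        }
        return conn;
    }

    public static void main(String[] args) {
        System.out.println("stranded without re-check: " + replay(false).backlog.size());
        System.out.println("written with re-check: " + replay(true).channel.size());
    }
}
```

Without the re-check the message stays in the backlog after the state is already READY, which is the situation in step 3. In real concurrent code the drain itself would also need to be safe against running from several threads at once, which this replay does not attempt to show.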
