[ 
https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769041#comment-16769041
 ] 

Benedict commented on CASSANDRA-15013:
--------------------------------------

[~jasobrown]: FWIW, I was proposing a _client configurable_ option, so 
operators shouldn't need to do anything. In fact, perhaps only client 
_authors_ would ever specify this, though some might expose it to the 
developers using their library if the application's semantics prefer one 
behaviour or the other.

I don't mind which we pick as the default, and don't mind if this also has a 
user-configurable option. TCP back pressure should be the preferred mechanism, 
but some clients probably don't behave well in the face of it, and for those 
clients specifying OverloadedException behaviour is probably useful.
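
As a rough illustration of the choice described above, the sketch below shows 
one way a per-connection policy could select between pausing reads (so TCP 
back pressure builds up) and rejecting the request with an overloaded error. 
It is a sketch only: OverloadPolicy, Admission, RequestAdmission and the 
maxInFlight limit are all hypothetical names, not Cassandra's actual API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical: the behaviour a client asks for when the server is overloaded.
enum OverloadPolicy { BACK_PRESSURE, THROW_OVERLOADED }

// Hypothetical: what the server does with one incoming request.
enum Admission { ACCEPT, PAUSE_READS, REJECT_OVERLOADED }

final class RequestAdmission {
    // Assumed limit on concurrently admitted requests (not a real setting).
    private final long maxInFlight;
    private final AtomicLong inFlight = new AtomicLong();

    RequestAdmission(long maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    // Admit the request if there is room; otherwise apply the client's policy.
    Admission tryAdmit(OverloadPolicy policy) {
        while (true) {
            long current = inFlight.get();
            if (current >= maxInFlight) {
                return policy == OverloadPolicy.BACK_PRESSURE
                        ? Admission.PAUSE_READS        // stop reading the socket
                        : Admission.REJECT_OVERLOADED; // reply with an error
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                return Admission.ACCEPT;
            }
        }
    }

    // Call once the response for an admitted request has been flushed.
    void release() {
        inFlight.decrementAndGet();
    }
}
```

With this shape the operator configures nothing: the client states its 
preferred policy, and the server only has to consult it at the point where a 
request would otherwise be queued.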

[~sumanth.pasupuleti]: Also, if you do the work of implementing back pressure, 
I am happy to make the change to monitor bytes instead of queued items.  I 
don't think it should be significantly more challenging, and it would permit us 
to more tightly bound system resource consumption.
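
To illustrate why tracking bytes bounds memory more tightly than counting 
queued items, here is a minimal sketch of a byte budget. ByteBudget and its 
methods are hypothetical names chosen for this example and do not reflect the 
eventual patch; the point is only that a single large message consumes as much 
budget as many small ones.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical byte-based limiter: bounds the total size of queued messages
// rather than their count.
final class ByteBudget {
    private final long capacityBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    ByteBudget(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Reserve room for a message of the given size; false means "over budget".
    boolean tryReserve(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        while (true) {
            long used = usedBytes.get();
            if (bytes > capacityBytes - used) {
                return false;
            }
            if (usedBytes.compareAndSet(used, used + bytes)) {
                return true;
            }
        }
    }

    // Return the reservation once the message has been flushed.
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}
```

An item-count limit of, say, 1000 entries permits anywhere from a few 
kilobytes to many megabytes depending on message size, whereas the budget 
above caps the total regardless of how it is divided.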

> Message Flusher queue can grow unbounded, potentially running JVM out of 
> memory
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15013
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15013
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Sumanth Pasupuleti
>            Assignee: Sumanth Pasupuleti
>            Priority: Major
>             Fix For: 4.0, 3.0.x, 3.11.x
>
>         Attachments: BlockedEpollEventLoopFromHeapDump.png, 
> BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap 
> dump showing each ImmediateFlusher taking upto 600MB.png
>
>
> This is a follow-up ticket to CASSANDRA-14855, to make the Flusher queue 
> bounded. In the current state, items are added to the queue without any 
> check on the queue size, and without checking the isWritable state of the 
> Netty outbound buffer.
> We are seeing this issue hit our production 3.0 clusters quite often.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

