[ https://issues.apache.org/jira/browse/CASSANDRA-11327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328789#comment-15328789 ]
Ariel Weisberg commented on CASSANDRA-11327:
--------------------------------------------

||Code|utests|dtests||
|[3.0 code|https://github.com/apache/cassandra/compare/cassandra-3.0...aweisberg:CASSANDRA-11327-3.0?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-3.0-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-3.0-dtest/]|
|[trunk code|https://github.com/apache/cassandra/compare/trunk...aweisberg:CASSANDRA-11327-trunk?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-trunk-dtest/]|

> Maintain a histogram of times when writes are blocked due to no available memory
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11327
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11327
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>
> I have a theory that part of the reason C* is so sensitive to timeouts during saturating write load is that throughput is basically a sawtooth with valleys at zero. This is something I have observed and it gets worse as you add 2i to a table or do anything that decreases the throughput of flushing.
> I think the fix for this is to incrementally release memory pinned by memtables and 2i during flushing instead of releasing it all at once. I know that's not really possible, but we can fake it with memory accounting that tracks how close to completion flushing is and releases permits for additional memory. This will lead to a bit of a sawtooth in real memory usage, but we can account for that so the peak footprint is the same.
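The accounting idea in the paragraph above (release permits in proportion to flush progress rather than all at once) might look roughly like the following. This is only a sketch under assumptions: `FlushPermitAccountant` and `onProgress` are hypothetical names that do not exist in Cassandra, and the flush is assumed to be able to report its progress as a fraction.

```java
/** Hypothetical helper, not a Cassandra class: releases memory permits as a flush progresses. */
final class FlushPermitAccountant {
    private final long pinnedBytes;   // memory pinned by the memtable (and 2i) being flushed
    private long releasedBytes = 0;   // permits already handed back to writers

    FlushPermitAccountant(long pinnedBytes) {
        this.pinnedBytes = pinnedBytes;
    }

    /**
     * Reports flush progress as a fraction in [0, 1] and returns the number of
     * additional bytes of permits that may be released to writers now.
     */
    synchronized long onProgress(double fraction) {
        double clamped = Math.max(0.0, Math.min(1.0, fraction));
        long target = (long) (pinnedBytes * clamped);
        // Never take permits back if progress is reported out of order.
        long delta = Math.max(0L, target - releasedBytes);
        releasedBytes += delta;
        return delta;
    }
}
```

The real memory is still freed only when the flush completes, so a caller would have to budget for the permits released early in order to keep the peak footprint unchanged.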
> I think the end result of this change will still be a sawtooth, but the valley of the sawtooth will not be zero; it will be the rate at which flushing progresses. Optimizing the rate at which flushing progresses and its fairness with other work can then be tackled separately.
> Before we do this I think we should demonstrate that pinned memory due to flushing is actually the issue. We can get better visibility into how often and for how long no memory is available by maintaining a histogram of the spans of time during which no memory is available and a thread is blocked.
> [MemtableAllocator$SubPool.allocate(long)|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/utils/memory/MemtableAllocator.java#L186] should be a relatively straightforward entry point for this. The first thread to block can mark the start of memory starvation and the last thread out can mark the end. Have a periodic task that tracks the amount of time spent blocked per interval of time and, if it is greater than some threshold, logs more details, possibly at debug.
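The "first thread in marks the start, last thread out marks the end" bookkeeping described in the ticket could be sketched as below. This is an illustration under assumptions, not the code in the linked branches: `StarvationTracker`, `onBlock`, `onUnblock` and `drainInterval` are hypothetical names, and timestamps are passed in (for example from `System.nanoTime()`) so the logic stays easy to test.

```java
/** Hypothetical helper, not a Cassandra class: tracks time during which at least one writer is blocked. */
final class StarvationTracker {
    private int blockedThreads = 0;          // writers currently waiting for memory
    private long starvationStartNanos = 0;   // meaningful only while blockedThreads > 0
    private long blockedNanosThisInterval = 0;

    /** Call just before a thread starts waiting for memory. */
    synchronized void onBlock(long nowNanos) {
        if (blockedThreads++ == 0)
            starvationStartNanos = nowNanos; // first thread to block marks the start
    }

    /** Call right after a thread stops waiting. */
    synchronized void onUnblock(long nowNanos) {
        if (--blockedThreads == 0)
            blockedNanosThisInterval += nowNanos - starvationStartNanos; // last thread out marks the end
    }

    /** Called by the periodic task; returns nanoseconds of starvation since the previous call. */
    synchronized long drainInterval(long nowNanos) {
        if (blockedThreads > 0) {
            // A span is still open: charge the elapsed part to this interval and restart its clock.
            blockedNanosThisInterval += nowNanos - starvationStartNanos;
            starvationStartNanos = nowNanos;
        }
        long result = blockedNanosThisInterval;
        blockedNanosThisInterval = 0;
        return result;
    }
}
```

A periodic task could call `drainInterval(System.nanoTime())` once per interval, compare the result with a configured threshold, and log at debug when the threshold is exceeded. Feeding each completed span into a histogram would be a natural extension, but is left out here.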