[ 
https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268054#comment-14268054
 ] 

Benedict commented on CASSANDRA-8518:
-------------------------------------

I agree it would be great to explicitly constrain the amount of in-flight data. 
In many ways this is a duplicate of CASSANDRA-5039, although it is framed 
subtly differently. The problem with the extra goal here, though, is that 
predicting how much heap a request will consume is very difficult. Ultimately 
any prediction scheme is unlikely to be substantially better than simply tuning 
the number of in-flight requests on the system. We could use Count-Min sketches 
on partitions, say, but that gets into complex, ugly territory for something 
that hopefully has a short shelf life, and it is unlikely to work well at all 
for non-local reads.
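
For concreteness, here is a minimal sketch of what a Count-Min-based 
per-partition size tracker might look like (PartitionSizeSketch and everything 
in it is hypothetical, not code that exists anywhere):

{code:java}
import java.util.Random;

// Hypothetical: Count-Min estimate of bytes read per partition, to predict
// the heap weight of a read request. Count-Min only over-estimates, which
// is at least the safe direction for this use.
public final class PartitionSizeSketch
{
    private final long[][] table; // depth x width counters
    private final long[] seeds;   // one hash seed per row
    private final int width;

    public PartitionSizeSketch(int depth, int width)
    {
        this.table = new long[depth][width];
        this.seeds = new long[depth];
        this.width = width;
        Random r = new Random();
        for (int i = 0; i < depth; i++)
            seeds[i] = r.nextLong();
    }

    private int bucket(int row, long keyHash)
    {
        long h = (keyHash ^ seeds[row]) * 0x9E3779B97F4A7C15L; // cheap mix
        return (int) ((h >>> 33) % width);
    }

    // Record that a read of this partition returned `bytes` bytes.
    public void record(long partitionKeyHash, long bytes)
    {
        for (int row = 0; row < table.length; row++)
            table[row][bucket(row, partitionKeyHash)] += bytes;
    }

    // Upper-bound estimate of cumulative bytes read from this partition.
    public long estimate(long partitionKeyHash)
    {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < table.length; row++)
            min = Math.min(min, table[row][bucket(row, partitionKeyHash)]);
        return min;
    }
}
{code}

Even this toy version would need per-CF instances, decay of stale counts, and 
a separate read counter to turn cumulative bytes into a per-read average, 
which is exactly the kind of complexity I mean.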

In my opinion the "right" solution is to implement streaming reads, so that we 
can explicitly bound the amount of in-flight data per request. Short of that, 
we could in the mean time impose an explicit _bound_ on memory used per-query, 
and terminate a query we detect to have gone above this bound, so that if you 
tune the number in flight you can guarantee the heap won't absolutely explode. 
We could even have all queries coordinate lightly and if we exceed an 
"in-flight" limit, kill one.. This could also be enforced by MessagingService 
as we read data off the wire, so that it never pollutes the system (other 
schemes would be hard to include MS in, I think). I would rather a "best 
effort" than exact bound, so that we could implement it easily, and I would 
probably stick to the simple per-request limit, as a stop-gap until we get 
streaming reads.
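
As a rough illustration of the best-effort shape I have in mind (all names 
here are made up, and the accounting is deliberately racy):

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical best-effort limiter: one global counter of in-flight bytes,
// checked as payloads are read off the wire. Races may briefly overshoot
// the limits; that is the accepted cost of keeping it simple.
public final class InFlightDataLimiter
{
    private final AtomicLong inFlightBytes = new AtomicLong();
    private final long globalLimit;
    private final long perRequestLimit;

    public InFlightDataLimiter(long globalLimit, long perRequestLimit)
    {
        this.globalLimit = globalLimit;
        this.perRequestLimit = perRequestLimit;
    }

    // Called before buffering `bytes` more for a request that has already
    // accumulated `requestSoFar` bytes. A false return means the request
    // should be terminated (the "kill one" case above).
    public boolean tryAcquire(long requestSoFar, long bytes)
    {
        if (requestSoFar + bytes > perRequestLimit)
            return false; // per-query bound exceeded
        long total = inFlightBytes.addAndGet(bytes);
        if (total > globalLimit)
        {
            inFlightBytes.addAndGet(-bytes); // roll back; caller kills query
            return false;
        }
        return true;
    }

    // Called as a request's buffered data is released.
    public void release(long bytes)
    {
        inFlightBytes.addAndGet(-bytes);
    }
}
{code}

MessagingService could consult something like this on every frame it reads, 
which is what would keep oversized responses from ever polluting the heap.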


> Cassandra Query Request Size Estimator
> --------------------------------------
>
>                 Key: CASSANDRA-8518
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Cheng Ren
>
> We have been suffering from Cassandra node crashes due to out-of-memory 
> errors for a long time. The heap dump from the most recent crash shows 22 
> native transport request threads, each of which consumes 3.3% of the heap, 
> taking more than 70% in total.
> Heap dump:
> !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955&w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600!
> Expanded view of one thread:
> !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955&w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600!
> The Cassandra version we are using now (2.0.4) uses 
> MemoryAwareThreadPoolExecutor as the request executor and provides a default 
> request size estimator that constantly returns 1, meaning it limits only the 
> number of requests being pushed to the pool. To have more fine-grained 
> control over handling requests and to better protect our nodes from OOM 
> issues, we propose implementing a more precise estimator.
> Here are our two cents:
> For update/delete/insert requests: size can be estimated by adding the sizes 
> of all class members together.
> For scan queries, the major part of the request is the response, which can be 
> estimated from historical data. For example, if we receive a scan query on a 
> column family for a certain token range, we record its response size and use 
> it as the estimated response size for later scan queries on the same column 
> family. For future requests on the same CF, the response size can be 
> calculated as (token range * recorded size) / recorded token range. The 
> request size is then estimated as (query size + estimated response size).
> We believe what we're proposing here can be useful for other people in the 
> Cassandra community as well. Would you mind providing us feedback? Please let 
> us know if you have any concerns or suggestions regarding this proposal.
> Thanks,
> Cheng
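
For concreteness, the estimator proposed in the description above could be 
sketched roughly as follows (class and method names are illustrative, not 
actual Cassandra APIs, and the single-slot history is a deliberate 
simplification):

{code:java}
// Hypothetical per-column-family estimator implementing the proposal above:
// estimated response = token span * recorded size / recorded token span,
// and request weight = query size + estimated response size.
public final class ScanRequestSizeEstimator
{
    private volatile double recordedTokenSpan = Double.NaN;
    private volatile long recordedResponseBytes;

    // Record the actual response size of a completed scan on this CF.
    // (Non-atomic on purpose: a best-effort estimate tolerates races.)
    public void record(double tokenSpan, long responseBytes)
    {
        recordedTokenSpan = tokenSpan;
        recordedResponseBytes = responseBytes;
    }

    // Estimated total weight of a new scan covering `tokenSpan`.
    public long estimate(long queryBytes, double tokenSpan)
    {
        if (Double.isNaN(recordedTokenSpan) || recordedTokenSpan == 0)
            return queryBytes; // no history yet: fall back to query size
        long estimatedResponse =
            (long) (tokenSpan * recordedResponseBytes / recordedTokenSpan);
        return queryBytes + estimatedResponse;
    }
}
{code}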


