[ https://issues.apache.org/jira/browse/CASSANDRA-15061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806286#comment-16806286 ]
Stefan Miklosovic commented on CASSANDRA-15061:
-----------------------------------------------

FYI, there is an option, used in CCM (and in turn in CircleCI too), that overrides the default value of MAX_HEAP_SIZE (1). I was not running dtests with these properties initially, so the default of 512M applied because I was not overriding anything myself. With 1G the tests started to look better, but some of them were still failing; after increasing to 4G the issue appeared to be resolved.

(1) [https://github.com/apache/cassandra/blob/a85196246c009566aa838156f3f56d00145e76ad/.circleci/config.yml#L84-L85]

> Dtests: tests are failing on too powerful machines, setting more memory per
> node in dtests
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15061
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15061
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest
>            Reporter: Stefan Miklosovic
>            Priority: Normal
>              Labels: CI, dtest, test
>
> While running dtests with 32 cores and 64 GB of memory on c5.9xlarge, some tests
> are failing because the nodes are not able to handle the load cassandra-stress
> generates for them.
> For example, there is this one (1), where we test that a cluster is
> able to cope with a bootstrapping node. The problem is that node1 is bombarded
> by cassandra-stress and is eventually killed, so the test fails
> before even proceeding to the test itself.
> I was told that dtests in CircleCI run in containers with 8
> cores and 16 GB of RAM, and I simulated this on my machine
> (-Dcassandra.available_processors=8). The core problem is that the nodes do not
> have enough memory: Xmx and Xms are set to only 512MB, which is a very low
> figure, and they are eventually killed.
> Proposed solutions:
> 1) Run dtests on less powerful machines, so they cannot generate stress high
> enough to kill the underlying nodes (a rather strange idea).
> 2) Increase memory per node - this should be configurable. I saw that 1GB
> helps but there are still some timeouts; 2GB is better, and 4GB would be the best.
> 3) Fix the tests in such a way that they do not fail with 512MB.
>
> 1) is not viable to me. 3) takes a lot of time to go through, does not
> actually solve anything, and it would be very cumbersome and clunky to go
> through all the tests to set them up like that. 2) seems to be the best
> approach, but there is no way I am aware of to add more memory to every node
> all at once, as node and cluster start / creation is scattered all over the
> project.
> I have raised the issue here (2) too.
> Do you guys think that if we manage to somehow fix this in CCM, we could
> introduce some switch / flag to dtests for how much memory a node in a cluster
> should run with?
> (1)
> [https://github.com/apache/cassandra-dtest/blob/master/bootstrap_test.py#L419-L470]
> (2) [https://github.com/riptano/ccm/issues/696]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
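The per-run override the comment describes (the same mechanism the linked CircleCI config uses) can be sketched as environment variables exported before launching dtests; cassandra-env.sh honors MAX_HEAP_SIZE and HEAP_NEWSIZE when they are set. The 4G figure comes from the comment above; the HEAP_NEWSIZE value and the pytest invocation are illustrative assumptions, not from the ticket:

```shell
# Sketch only: raise the per-node heap before a dtest run.
# cassandra-env.sh picks these up if set; nodes started by CCM inherit them.
export MAX_HEAP_SIZE=4G    # value the commenter found sufficient
export HEAP_NEWSIZE=800M   # illustrative companion value, not from the ticket

# Hypothetical invocation; adjust to however you normally run dtests:
# pytest bootstrap_test.py

echo "MAX_HEAP_SIZE=${MAX_HEAP_SIZE} HEAP_NEWSIZE=${HEAP_NEWSIZE}"
```

This only changes the JVM heap for the current shell's runs; a proper fix would still need a CCM-level flag so every node in a cluster picks the value up regardless of where it is created.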