[ https://issues.apache.org/jira/browse/CASSANDRA-15061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806286#comment-16806286 ]
Stefan Miklosovic commented on CASSANDRA-15061:
-----------------------------------------------

FYI, there is an option, used in CCM (and in turn in CircleCI too), that overrides the default value of MAX_HEAP_SIZE (1). I was not running dtests with these properties initially, so the default of 512M applied because I was not overriding anything myself. With 1G the tests started to look better, but some of them were still failing; after increasing to 4G the issue appeared to be resolved.

(1) [https://github.com/apache/cassandra/blob/a85196246c009566aa838156f3f56d00145e76ad/.circleci/config.yml#L84-L85]

> Dtests: tests are failing on too powerful machines, setting more memory per
> node in dtests
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15061
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15061
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest
>            Reporter: Stefan Miklosovic
>            Priority: Normal
>              Labels: CI, dtest, test
>
> While running dtests with 32 cores and 64 GB of memory on c5.9xlarge, some tests
> are failing because the nodes are not able to handle the load cassandra-stress
> generates for them.
> For example, there is this one (1), where we test that a cluster is
> able to cope with a bootstrapping node. The problem is that node1 is bombarded
> by cassandra-stress and is eventually killed, so the test fails
> before even proceeding to the test itself.
> I was told that dtests in CircleCI run in containers with 8
> cores and 16 GB of RAM, and I simulated this on my machine
> (-Dcassandra.available_processors=8). The core problem is that the nodes do not
> have enough memory: Xmx and Xms are set to only 512MB, which is a very low
> figure, and they are eventually killed.
> Proposed solutions:
> 1) Run dtests on less powerful machines, so they cannot generate stress high
> enough to kill the underlying nodes (a rather strange idea).
> 2) Increase memory per node - this should be configurable. I saw that 1GB
> helps but there are still some timeouts; 2GB is better, and 4GB would be the best.
> 3) Fix the tests in such a way that they do not fail with 512MB.
>
> 1) is not viable to me. 3) takes a lot of time to go through, does not
> actually solve anything, and it would be very cumbersome and clunky to go
> through all the tests to set them up like that. 2) seems to be the best
> approach, but there is no way I am aware of to add more memory to every node
> all at once, as node and cluster start / creation is scattered all over the
> project.
> I have raised the issue here (2) too.
> Do you guys think that if we manage to somehow fix this in CCM, we could
> introduce some switch / flag to dtests for how much memory a node in a cluster
> should run with?
> (1)
> [https://github.com/apache/cassandra-dtest/blob/master/bootstrap_test.py#L419-L470]
> (2) [https://github.com/riptano/ccm/issues/696]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
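The per-run override the comment describes (the same mechanism the linked CircleCI config uses) can be sketched as environment variables exported before launching dtests; cassandra-env.sh honors MAX_HEAP_SIZE and HEAP_NEWSIZE when they are set. The 4G figure comes from the comment above; the HEAP_NEWSIZE value and the pytest invocation are illustrative assumptions, not from the ticket:

```shell
# Sketch only: raise the per-node heap before a dtest run.
# cassandra-env.sh picks these up if set; nodes started by CCM inherit them.
export MAX_HEAP_SIZE=4G    # value the commenter found sufficient
export HEAP_NEWSIZE=800M   # illustrative companion value, not from the ticket

# Hypothetical invocation; adjust to however you normally run dtests:
# pytest bootstrap_test.py

echo "MAX_HEAP_SIZE=${MAX_HEAP_SIZE} HEAP_NEWSIZE=${HEAP_NEWSIZE}"
```

This only changes the JVM heap for the current shell's runs; a proper fix would still need a CCM-level flag so every node in a cluster picks the value up regardless of where it is created.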