[ https://issues.apache.org/jira/browse/ARTEMIS-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171422#comment-17171422 ]
Francesco Nigro edited comment on ARTEMIS-2852 at 8/5/20, 11:52 AM:
--------------------------------------------------------------------

[~kkondzielski] Yep, that seems more correct:
- the 3 live-backup pairs are there to prevent split brain from happening
- having a symmetric cluster should be OK to get proper redistribution of messages (and achieve server-side load balancing)
- ideally, round-robin client-side load balancing would help clients get proper load balancing for each client node

I still have another question (and more later, I just need to think more about it):
- what is the exact meaning of threads/sender nodes/receiver nodes in terms of the number of connections, sessions, etc.?

And I have a proposal: to understand the scaling capability, even when using a single broker (as has been done until now), it is important to understand how the scalability of the whole system behaves, IMO.
It would be nice to have 3 baselines:
1) single broker (without HA)
2) single HA pair (1 live, 1 backup)
3) 3 HA pairs to achieve HA with no split brain, but NO load balancing, just 1 live processing messages

Considering that the clients (code and machines) basically don't do any processing with the messages, I'm not quite sure that load balancing is needed here at all, hence I'm not 100% sure that adding redistribution on the other 2 nodes is meaningful...
Probably [~jbertram] has some thoughts on this, given that he was going to add an OFF_WITH_REDISTRIBUTION option recently...
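The round-robin client-side load balancing mentioned in the comment above can be illustrated with a minimal sketch. This is not Artemis client code: the broker addresses and the `RoundRobinPicker` helper are hypothetical, and a real client would normally rely on the load balancing policy offered by its messaging library. The sketch only shows how successive connections would rotate across the three live brokers.

```python
from collections import Counter
from itertools import cycle


class RoundRobinPicker:
    """Hands out broker URLs in a fixed rotation (hypothetical helper)."""

    def __init__(self, brokers):
        if not brokers:
            raise ValueError("at least one broker is required")
        self._rotation = cycle(list(brokers))

    def next_broker(self):
        # Each call returns the next broker, wrapping around at the end.
        return next(self._rotation)


# Hypothetical addresses for the three live brokers of the cluster.
LIVE_BROKERS = [
    "tcp://live-1:61616",
    "tcp://live-2:61616",
    "tcp://live-3:61616",
]


def assign_connections(picker, count):
    """Return the broker chosen for each of `count` new connections."""
    return [picker.next_broker() for _ in range(count)]


assignments = assign_connections(RoundRobinPicker(LIVE_BROKERS), 9)
per_broker = Counter(assignments)
```

With nine connections and three live brokers, each broker ends up with the same share, which is the even spread the comment is asking for on the client side.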

was (Author: nigro....@gmail.com):
[~kkondzielski] Yep, that seems more correct:
- the 3 live-backup pairs are there to prevent split brain from happening
- having a symmetric cluster should be OK to get proper redistribution of messages (and achieve server-side load balancing)
- ideally, round-robin client-side load balancing would help clients get proper load balancing for each client node

I still have another question (and more later, I just need to think more about it):
- what is the exact meaning of threads/sender nodes/receiver nodes in terms of the number of connections, sessions, etc.?

And I have a proposal: to understand the scaling capability, even when using a single broker (as has been done until now), it is important to understand how the scalability of the whole system behaves, IMO.
It would be nice to have 2 baselines:
1) single broker (without HA)
2) single HA pair (1 live, 1 backup)

And later add the other 2 live-backup pairs to check how the numbers change. wdyt?

> Huge performance decrease between versions 2.2.0 and 2.13.0
> -----------------------------------------------------------
>
>                 Key: ARTEMIS-2852
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2852
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>            Reporter: Kasper Kondzielski
>            Priority: Major
>         Attachments: Selection_433.png, Selection_434.png, Selection_440.png,
> Selection_441.png, Selection_451.png
>
>
> Hi,
> Recently, we started to prepare a new revision of our blog post in which we
> test various implementations of replicated queues. The previous version can be
> found here: [https://softwaremill.com/mqperf/]
> We updated the Artemis binary to 2.13.0, regenerated the configuration file and
> applied all the performance tricks you told us last time.
> In particular, these were:
> * the {{Xmx}} java parameter bumped to {{16G (now bumped to 48G)}}
> * in {{broker.xml}}, the {{global-max-size}} setting changed to {{8G (this
> one we forgot to set, but we suspect that it is not the issue)}}
> * {{journal-type}} set to {{MAPPED}}
> * {{journal-datasync}}, {{journal-sync-non-transactional}} and
> {{journal-sync-transactional}} all set to false
> Apart from that, we changed the machine type we use to r5.2xlarge (8 cores, 64
> GiB memory, network bandwidth up to 10 Gbps, storage bandwidth up to 4,750
> Mbps) and we decided to always run twice as many receivers as senders.
> From our tests, it looks like version 2.13.0 does not scale as well as
> version 2.2.0 (previously tested) as the number of senders and receivers increases.
> Basically, it is not scaling at all: the throughput stays almost at the same
> level, while previously it used to grow linearly.
> Here you can find our test results for both versions:
> [https://docs.google.com/spreadsheets/d/1kr9fzSNLD8bOhMkP7K_4axBQiKel1aJtpxsBCOy9ugU/edit?usp=sharing]
> We are aware that there is now a dedicated page in the documentation about
> performance tuning, but we are surprised that the same settings as before
> perform much worse.
> Maybe there is an obvious property that we overlooked and that should be turned
> on?
> All changes between those versions, together with the final configuration, can
> be found in this merged PR:
> [https://github.com/softwaremill/mqperf/commit/6bfae489e11a250dc9e6ef59719782f839e8874a]
>
> Charts showing the machines' usage are in the attachments. Memory consumed by the
> Artemis process didn't exceed ~16 GB. Bandwidth and CPU weren't bottlenecks either.
> p.s. I wanted to ask this question on the mailing list/Nabble forum first, but it
> seems that I don't have permission to do so even though I registered and
> subscribed. Is that intentional?
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)