Alexander Belyak created IGNITE-21039:
-----------------------------------------
             Summary: Network performance optimization
                 Key: IGNITE-21039
                 URL: https://issues.apache.org/jira/browse/IGNITE-21039
             Project: Ignite
          Issue Type: Improvement
          Components: networking
    Affects Versions: 3.0
            Reporter: Alexander Belyak

I've run several tests to measure MessagingService performance, and this is what I found:
{noformat}
TestBoolaMessage     139MB/sec
TestByteaMessage     132MB/sec
TestDoubleaMessage   102MB/sec
TestFloataMessage    132MB/sec
TestDoubleaMessage   130MB/sec
TestLongaMessage     131MB/sec
TestDoubleaMessage   131MB/sec
TestStringaMessage   280MB/sec

TestBoolMessage      11MB/sec
TestByteMessage      12MB/sec
TestDoubleMessage    12MB/sec
TestFloatMessage     13MB/sec
TestIntMessage       12MB/sec
TestLongMessage      11MB/sec
TestShortMessage     12MB/sec
TestStringMessage    18MB/sec

TestBool20Message    15MB/sec
TestByte20Message    12MB/sec
TestDouble20Message  32MB/sec
TestFloat20Message   22MB/sec
TestInt20Message     13MB/sec
TestLong20Message    14MB/sec
TestShort20Message   14MB/sec
TestString20Message  65MB/sec
{noformat}
All messages were sent in the same setup: 2 server nodes connected with a *10GBit* interface. *Iperf3* (iperf3 --time 30 --zerocopy --client 192.168.1.126 --omit 3 --interval 1 --length 16384 --window 131072 --parallel 2 --json --version4) shows about *850MB/sec* network throughput, but the *best AI3* result was only {*}280MB/sec{*}.

The results above use three types of messages:
1. {*}Test<Type>aMessage{*}: an array of 163840 elements (primitive, except String) of type <Type>.
2. {*}Test<Type>Message{*}: a single property (primitive, except String) of type <Type>.
3. {*}Test<Type>20Message{*}: 20 properties (primitive, except String) of type <Type>.

All messages were sent in parallel from a single thread with a window of 100 messages: as soon as an ack for an earlier message arrived, a new message was sent.
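For readers who want to reproduce the windowed sending described above, here is a minimal sketch. It assumes the window is enforced with a semaphore; the issue does not show the actual benchmark harness, so the class `WindowedSender`, the methods `sendWithWindow` and `mbPerSec`, and the `sendOnce` supplier are all hypothetical names. In a real run `sendOnce` would be something like `() -> messagingService.send(target, msg)`; here it is simulated with an already-scheduled no-op future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

class WindowedSender {
    /**
     * Issues {@code total} sends from the calling thread, keeping at most
     * {@code window} of them un-acked at any time. {@code sendOnce} is a
     * hypothetical stand-in for messagingService.send(target, msg).
     * Returns the number of sends that completed without an error.
     */
    public static long sendWithWindow(Supplier<CompletableFuture<Void>> sendOnce,
                                      int window, long total) {
        Semaphore permits = new Semaphore(window);
        AtomicLong acked = new AtomicLong();
        for (long i = 0; i < total; i++) {
            // Blocks while the window is full; an ack frees a slot.
            permits.acquireUninterruptibly();
            sendOnce.get().whenComplete((v, t) -> {
                if (t == null) {
                    acked.incrementAndGet();
                }
                permits.release();
            });
        }
        // Wait until every in-flight send has completed (acked or failed).
        permits.acquireUninterruptibly(window);
        return acked.get();
    }

    /** Throughput in MB/sec (1 MB = 1,000,000 bytes) for fixed-size payloads. */
    public static double mbPerSec(long messages, int payloadBytes, long elapsedNanos) {
        double megabytes = messages * (double) payloadBytes / 1_000_000.0;
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        int payloadBytes = 1024;   // assumed payload size, for illustration only
        long total = 100_000;
        long start = System.nanoTime();
        // Simulated send: completes asynchronously on the common pool.
        long acked = sendWithWindow(() -> CompletableFuture.runAsync(() -> { }), 100, total);
        long elapsed = System.nanoTime() - start;
        System.out.printf("acked=%d, simulated throughput=%.1f MB/sec%n",
                acked, mbPerSec(acked, payloadBytes, elapsed));
    }
}
```

With a simulated send the printed figure only reflects the overhead of the window bookkeeping, not network throughput; it is useful mainly as a baseline to subtract from real measurements.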
Low network utilization was expected for the very short messages (such as 1 int or 20 int fields), but compared with the iperf3 results, MessagingService performance for the 163KBytes messages was also very low. It became significantly better only when sending a huge array of strings (the same string "Test string to check message service performance." repeated).

I ran another batch of tests with a 1KB byte[] property in the message, in 1 and 8 threads, with no send window at all (each thread sends the next message only after getting the ack for the previous one):
* *1 thread*: *37 MBytes/sec*
* *8 threads*: *63 MBytes/sec*

Eight threads gave only about 1.7 times the single-thread throughput, so I suspect there is significant contention.

All messages were sent in the following manner:
{code:java}
private void send(ClusterNode target, NetworkMessage msg) {
    messagingService.send(target, msg).handle((v, t) -> {
        if (t != null) {
            LOG.info("Error while sending huge message", t);
        }
        // Resend the same message until the test deadline passes.
        if (time() < timeout) {
            send(target, msg);
        }
        return null;
    });
}
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)