Hi all,

We are running VMs on OpenStack with OVN and have hit a performance problem, tested using iperf3 with TCP: we get ~300 Kbit/s in the problematic scenario, when normal traffic is around 1 Gbit/s. Some observations (a sketch of the test matrix we used follows the list):

* Only TCP is affected; UDP iperf3 tests are fine
* It only happens between some nodes, not others
* In some cases it only happens in one direction
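
Something like the following sketch reproduces our test matrix by driving the stock iperf3 CLI (the server address 10.0.0.21 is a placeholder; `-u` selects UDP, `-R` reverses direction):

```python
#!/usr/bin/env python3
"""Sketch of the test matrix: TCP vs UDP, forward vs reverse."""
import json
import subprocess

def iperf3_mbps(server: str, udp: bool = False, reverse: bool = False) -> float:
    """Run a 10 s iperf3 test and return receiver-side throughput in Mbit/s."""
    cmd = ["iperf3", "-c", server, "-t", "10", "-J"]  # -J: JSON output
    if udp:
        cmd += ["-u", "-b", "1G"]  # UDP needs an offered rate
    if reverse:
        cmd.append("-R")           # server transmits, client receives
    out = json.loads(subprocess.run(cmd, capture_output=True, text=True).stdout)
    # TCP reports end.sum_received; UDP reports end.sum
    summary = out["end"].get("sum_received") or out["end"]["sum"]
    return summary["bits_per_second"] / 1e6

for udp in (False, True):
    for reverse in (False, True):
        mbps = iperf3_mbps("10.0.0.21", udp=udp, reverse=reverse)  # placeholder IP
        print(f"udp={udp} reverse={reverse}: {mbps:.1f} Mbit/s")
```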

We've looked into this and found that the poor performance may be due to retransmissions / congestion. Looking deeper, there seems to be some interesting behaviour around fragmentation / reassembly.
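
For anyone looking at the captures (linked at the bottom), this is roughly how retransmissions can be counted with scapy. It is only a sketch: the filename is an assumption, and any repeated (flow, seq) tuple carrying payload is treated as a retransmit:

```python
#!/usr/bin/env python3
"""Rough retransmission count: duplicate (flow, seq) tuples with payload."""
from collections import Counter
from scapy.all import rdpcap, IP, TCP

pkts = rdpcap("vm23.pcap")  # filename assumed from the captures linked below
seen = Counter()
for p in pkts:
    if IP in p and TCP in p and len(p[TCP].payload) > 0:
        # Key on the flow 4-tuple plus the TCP sequence number
        seen[(p[IP].src, p[IP].dst, p[TCP].sport, p[TCP].dport, p[TCP].seq)] += 1

retrans = sum(n - 1 for n in seen.values() if n > 1)
print(f"{retrans} apparent retransmissions out of {len(pkts)} packets")
```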

Our architecture is like this:

VM23 - [TAP23 - BOND23] -- (internet) -- [BOND21 - TAP21] - VM21

The VMs run on hypervisors; on each hypervisor, the tap devices egress out the bonds.

We have taken tcpdumps at VM23, VM21, BOND21 and TAP21.

What we have found is that a PSH,ACK packet from VM23 is rewritten into an ACK packet by the time it reaches the bond.

When they get to the other side, these packets don't seem to be reassembled properly before being passed on through the tap into VM21.

We would like to know: is this behaviour (rewriting a PSH,ACK into a separate ACK packet) normal for OVS/OVN? Is there any other reason why there are so many retransmissions?

I'm not sure whether this is an OVN or an OVS issue, so apologies if this is not the right list. I'm also not sure whether I'm debugging this correctly. Any help would be welcome!

Regards,
Jake

--
I've uploaded the tcpdumps to https://swift.rc.nectar.org.au/v1/AUTH_f42f9588576c43969760d81384b83b1f/jake-temp/ if anyone is interested.

To line up the pcaps, apply the Wireshark display filter `tcp.srcport == 59320`.

The "interesting" packet is:
packet 23 on vm23.pcap    PSH, ACK (976 bytes) which turned into
packet 32 on bond21.pcap  ACK, (488 bytes)
which did not make it onto the tap device at all?
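
This is a minimal sketch, assuming the pcaps above have been downloaded locally (note that scapy indexes packets from 0, whereas Wireshark counts from 1):

```python
#!/usr/bin/env python3
"""Compare the 'interesting' packet across the two captures."""
from scapy.all import rdpcap, TCP

vm23 = rdpcap("vm23.pcap")
bond21 = rdpcap("bond21.pcap")

# Wireshark packet N is list index N-1 in scapy
for name, pkt in (("vm23 #23", vm23[22]), ("bond21 #32", bond21[31])):
    tcp = pkt[TCP]
    print(f"{name}: flags={tcp.flags} seq={tcp.seq} payload={len(tcp.payload)} bytes")
```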