Got it, thanks!
Yifeng
On Fri, Feb 14, 2020 at 11:29 AM Flavio Leitner <f...@sysclose.org> wrote:
>
> On Fri, Feb 14, 2020 at 09:44:52AM -0800, Yifeng Sun wrote:
> > Hi Flavio,
> >
> > Can you please confirm the kernel versions you are using?
> >
> > Host KVM: 5.2.14-200.fc30.x86_64.
>
> Host KVM: 5.5.0+
>
> > VM: 4.15.0 from upstream ubuntu.
>
> VM: 4.15.0 from Linus git tree.
>
> fbl
>
> > Thanks,
> > Yifeng
> >
> > On Thu, Feb 13, 2020 at 12:05 PM Flavio Leitner <f...@sysclose.org> wrote:
> > >
> > > Hi Yifeng,
> > >
> > > Sorry for the late response.
> > >
> > > On Wed, Jan 29, 2020 at 09:04:39AM -0800, Yifeng Sun wrote:
> > > > Hi Flavio,
> > > >
> > > > Sorry, in my last email one change is incorrect. It should be:
> > > > in tcp_v4_rcv()
> > > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > > + if (0)
> > > >
> > > > The kernel version I am using is Ubuntu 18.04's default kernel:
> > > > $ uname -r
> > > > 4.15.0-76-generic
> > >
> > > I deployed a VM with 4.15.0 from upstream and I can ssh, scp (back
> > > and forth), iperf3 (direct, reverse, with TCP or UDP) between that
> > > VM and another VM, veth, bridge and another host without issues.
> > >
> > > Any chance for you to try with the same upstream kernel version?
> > >
> > > Thanks,
> > > fbl
> > >
> > > > Thanks,
> > > > Yifeng
> > > >
> > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > >
> > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > > > Sure.
> > > > > >
> > > > > > Firstly, make sure userspace-tso-enable is true:
> > > > > > # ovs-vsctl get Open_vSwitch . other_config
> > > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > > > userspace-tso-enable="true"}
> > > > > >
> > > > > > Next, create 2 VMs with a vhostuser-type interface on the same KVM host:
> > > > > > <interface type='vhostuser'>
> > > > > >   <mac address='88:69:00:00:00:11'/>
> > > > > >   <source type='unix' path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > > > >   <model type='virtio'/>
> > > > > >   <driver queues='2' rx_queue_size='512'>
> > > > > >     <host csum='on' tso4='on' tso6='on'/>
> > > > > >     <guest csum='on' tso4='on' tso6='on'/>
> > > > >
> > > > > I have other options set, but I don't think they are related:
> > > > >     <host csum='on' gso='off' tso4='on' tso6='on' ecn='off' ufo='off' mrg_rxbuf='on'/>
> > > > >     <guest csum='on' tso4='on' tso6='on' ecn='off' ufo='off'/>
> > > > >
> > > > > >   </driver>
> > > > > >   <alias name='net2'/>
> > > > > >   <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> > > > > > </interface>
> > > > > >
> > > > > > When the VM boots up, turn on tx, tso and sg:
> > > > > > # ethtool -K ens6 tx on
> > > > > > # ethtool -K ens6 tso on
> > > > > > # ethtool -K ens6 sg on
> > > > >
> > > > > All the needed offloading features are turned on by default,
> > > > > so I don't change anything in my testbed.
> > > > >
> > > > > > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on the other VM.
> > > > > > iperf doesn't work if there is no change to the VM's kernel. `tcpdump` shows
> > > > > > that the iperf server received packets with invalid TCP checksums.
> > > > > > `nstat -a` shows that the TcpInCsumErr counter is accumulating.
> > > > > >
> > > > > > After adding the changes below to the VM's kernel, iperf works properly.
> > > > > > in tcp_v4_rcv()
> > > > > > - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > > > > + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > > > >
> > > > > > static inline bool tcp_checksum_complete(struct sk_buff *skb)
> > > > > > {
> > > > > >     return 0;
> > > > > > }
> > > > >
> > > > > That's odd. Which kernel is that? Maybe I can try the same version.
> > > > > I am using 5.2.14-200.fc30.x86_64.
> > > > >
> > > > > Looks like somehow the packet lost its offloading flags, then the kernel
> > > > > has to check the csum, and since it wasn't calculated before, it's
> > > > > just random garbage.
> > > > >
> > > > > fbl
> > > > >
> > > > > > Best,
> > > > > > Yifeng
> > > > > >
> > > > > > On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner <f...@sysclose.org> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
> > > > > > > > Hi Flavio,
> > > > > > > >
> > > > > > > > Thanks for the explanation. I followed the steps in the document, but
> > > > > > > > a TCP connection still failed to build between the 2 VMs.
> > > > > > > >
> > > > > > > > I finally modified the VM's kernel directly to disable TCP checksum
> > > > > > > > validation to get it working properly. I got 30.0 Gbps for 'iperf'
> > > > > > > > between the 2 VMs.
> > > > > > >
> > > > > > > Could you provide more details on how you did that? What's running
> > > > > > > inside the VM?
> > > > > > >
> > > > > > > I don't change anything inside of the VMs (Linux) in my testbed.
> > > > > > >
> > > > > > > fbl
> > > > > > >
> > > > > > > > Best,
> > > > > > > > Yifeng
> > > > > > > >
> > > > > > > > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
> > > > > > > > > > Hi Ilya,
> > > > > > > > > >
> > > > > > > > > > Thanks for your reply.
> > > > > > > > > >
> > > > > > > > > > The thing is, if checksum offloading is enabled in both VMs, then the
> > > > > > > > > > sender VM will send a packet with an invalid TCP checksum, and later
> > > > > > > > > > OVS will send this packet to the receiver VM directly without
> > > > > > > > > > calculating a valid checksum. As a result, the receiver VM will drop
> > > > > > > > > > this packet because it contains an invalid checksum. This is what
> > > > > > > > > > happened when I tried this patch.
> > > > > > > > >
> > > > > > > > > When TSO is enabled, the TX checksumming offloading is required,
> > > > > > > > > so you will see invalid checksums. This is well documented here:
> > > > > > > > >
> > > > > > > > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
> > > > > > > > >
> > > > > > > > > "Additionally, if the traffic is headed to a VM within the same host
> > > > > > > > > further optimization can be expected. As the traffic never leaves
> > > > > > > > > the machine, no MTU needs to be accounted for, and thus no
> > > > > > > > > segmentation and checksum calculations are required, which saves yet
> > > > > > > > > more cycles."
> > > > > > > > >
> > > > > > > > > Therefore, it's expected to see bad csums in the traffic dumps.
> > > > > > > > >
> > > > > > > > > To use the feature, you need a few steps: enable the feature in OvS,
> > > > > > > > > enable it in qemu and inside the VM. The Linux guest usually enables
> > > > > > > > > the feature by default if qemu offers it.
> > > > > > > > >
> > > > > > > > > HTH,
> > > > > > > > > fbl
> > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yifeng
> > > > > > > > > >
> > > > > > > > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 27.01.2020 18:24, Yifeng Sun wrote:
> > > > > > > > > > > > Hi Flavio,
> > > > > > > > > > > >
> > > > > > > > > > > > I am testing your patch using iperf between 2 VMs on the same host.
> > > > > > > > > > > > But it seems that a TCP connection can't be created between these 2 VMs.
> > > > > > > > > > > > When inspecting further, I found that the TCP packets have invalid checksums.
> > > > > > > > > > > > This might be the reason.
> > > > > > > > > > > >
> > > > > > > > > > > > I am wondering if I missed something in the setup?
> > > > > > > > > > > > Thanks a lot.
> > > > > > > > > > >
> > > > > > > > > > > I didn't test myself, but according to the current design, checksum
> > > > > > > > > > > offloading (rx and tx) should be enabled in both VMs. Otherwise all
> > > > > > > > > > > the packets will be dropped by the guest kernel.
> > > > > > > > > > >
> > > > > > > > > > > Best regards, Ilya Maximets.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > fbl
> > > > > > >
> > > > > > > --
> > > > > > > fbl
> > > > >
> > > > > --
> > > > > fbl
> > >
> > > --
> > > fbl
>
> --
> fbl

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
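The thread above boils down to a short checklist. The following is a minimal sketch of it, not an authoritative recipe: the interface name ens6, the iperf commands and the TcpInCsumErr counter come from Yifeng's messages, the userspace-tso-enable key follows the OVS userspace-tso documentation linked above, and the note that ovs-vswitchd needs a restart after changing it is an assumption.

  # On the host: enable userspace TSO in OVS and confirm the setting
  # (assumed to require restarting ovs-vswitchd to take effect).
  ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
  ovs-vsctl get Open_vSwitch . other_config

  # Inside each guest: check that the virtio NIC negotiated the offloads
  # the feature relies on, and switch them on if any are off.
  ethtool -k ens6 | grep -E 'tx-checksumming|scatter-gather|tcp-segmentation-offload'
  ethtool -K ens6 tx on sg on tso on

  # Guest-to-guest test: server on VM1, client on VM2, and watch the
  # receiver's checksum-error counter while traffic is flowing.
  iperf -s                      # on VM1
  iperf -c <VM1-address>        # on VM2; <VM1-address> is a placeholder
  nstat -a | grep TcpInCsumErr  # on VM1

When the feature works end to end, bad TCP checksums in tcpdump on the vhostuser ports are expected, as Flavio points out above, but TcpInCsumErr inside the receiving guest should stay flat and the iperf transfer should actually run (Yifeng reports roughly 30 Gbps between the two VMs once it does).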