Got it, thanks!

Yifeng

On Fri, Feb 14, 2020 at 11:29 AM Flavio Leitner <f...@sysclose.org> wrote:
>
> On Fri, Feb 14, 2020 at 09:44:52AM -0800, Yifeng Sun wrote:
> > Hi Flavio,
> >
> > Can you please confirm the kernel versions you are using?
> >
> > Host KVM: 5.2.14-200.fc30.x86_64.
>
> Host KVM: 5.5.0+
>
> > VM: 4.15.0 from upstream ubuntu.
>
> VM: 4.15.0 from Linus git tree.
>
> fbl
>
> >
> > Thanks,
> > Yifeng
> >
> > On Thu, Feb 13, 2020 at 12:05 PM Flavio Leitner <f...@sysclose.org> wrote:
> > >
> > >
> > > Hi Yifeng,
> > >
> > > Sorry for the late response.
> > >
> > > On Wed, Jan 29, 2020 at 09:04:39AM -0800, Yifeng Sun wrote:
> > > > Hi Flavio,
> > > >
> > > > Sorry, one change in my last email was incorrect. It should be:
> > > > in tcp_v4_rcv()
> > > >       - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > >       + if (0)
> > > >
> > > > The kernel version I am using is ubuntu 18.04's default kernel:
> > > > $ uname -r
> > > > 4.15.0-76-generic
> > >
> > > I deployed a VM with 4.15.0 from upstream and I can ssh, scp (back
> > > and forth), iperf3 (direct, reverse, with TCP or UDP) between that
> > > VM and another VM, veth, bridge and another host without issues.
> > >
> > > Any chance for you to try with the same upstream kernel version?
> > >
> > > Thanks,
> > > fbl
> > >
> > > >
> > > > Thanks,
> > > > Yifeng
> > > >
> > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > >
> > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > > > Sure.
> > > > > >
> > > > > > Firstly, make sure userspace-tso-enable is true
> > > > > > # ovs-vsctl get Open_vSwitch . other_config
> > > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > > > userspace-tso-enable="true"}
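A minimal sketch of checking that setting programmatically, assuming the map format shown in the output above (double-quoted values). The helper names `parse_other_config` and `tso_enabled` are hypothetical and not part of OVS.

```python
import re


def parse_other_config(text):
    """Parse the {key="value", ...} map printed by `ovs-vsctl get`."""
    # Join wrapped lines first; the output above is split across two lines.
    flat = " ".join(text.split())
    # Keys may contain dashes; values are assumed to be double-quoted.
    return dict(re.findall(r'([\w-]+)="([^"]*)"', flat))


def tso_enabled(text):
    """Return True only if userspace-tso-enable is explicitly "true"."""
    return parse_other_config(text).get("userspace-tso-enable") == "true"


# Sample copied from the ovs-vsctl output quoted above.
sample = '''{dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
userspace-tso-enable="true"}'''
print(tso_enabled(sample))
```

On a real host the text would come from running the `ovs-vsctl get Open_vSwitch . other_config` command quoted above.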
> > > > > >
> > > > > > > Next, create 2 VMs with a vhostuser-type interface on the same KVM host:
> > > > > >     <interface type='vhostuser'>
> > > > > >       <mac address='88:69:00:00:00:11'/>
> > > > > > >       <source type='unix' path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > > > >       <model type='virtio'/>
> > > > > >       <driver queues='2' rx_queue_size='512'>
> > > > > >         <host csum='on' tso4='on' tso6='on'/>
> > > > > >         <guest csum='on' tso4='on' tso6='on'/>
> > > > >
> > > > > I have other options set, but I don't think they are related:
> > > > > >        <host csum='on' gso='off' tso4='on' tso6='on' ecn='off' ufo='off' mrg_rxbuf='on'/>
> > > > >        <guest csum='on' tso4='on' tso6='on' ecn='off' ufo='off'/>
> > > > >
> > > > >
> > > > > >       </driver>
> > > > > >       <alias name='net2'/>
> > > > > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> > > > > >     </interface>
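A small sketch for sanity-checking the offload attributes in an interface definition like the one above. It assumes the element layout shown in this thread (`driver` with `host` and `guest` children). The function name `missing_offloads` is hypothetical, and treating an absent attribute as missing is stricter than libvirt's own defaults.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("csum", "tso4", "tso6")


def missing_offloads(interface_xml):
    """Return (side, feature) pairs that are not explicitly set to 'on'."""
    root = ET.fromstring(interface_xml)
    driver = root.find("driver")
    missing = []
    for side in ("host", "guest"):
        elem = driver.find(side) if driver is not None else None
        for feature in REQUIRED:
            if elem is None or elem.get(feature) != "on":
                missing.append((side, feature))
    return missing


# Trimmed copy of the interface definition quoted above.
sample = """
<interface type='vhostuser'>
  <model type='virtio'/>
  <driver queues='2' rx_queue_size='512'>
    <host csum='on' tso4='on' tso6='on'/>
    <guest csum='on' tso4='on' tso6='on'/>
  </driver>
</interface>
"""
print(missing_offloads(sample))
```

An empty list means every checked feature is explicitly 'on' for both the host and the guest side.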
> > > > > >
> > > > > > > When the VM boots up, turn on tx, tso and sg:
> > > > > > # ethtool -K ens6 tx on
> > > > > > # ethtool -K ens6 tso on
> > > > > > # ethtool -K ens6 sg on
> > > > >
> > > > > All the needed offloading features are turned on by default,
> > > > > so I don't change anything in my testbed.
> > > > >
> > > > > > > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM.
> > > > > > > Iperf doesn't work if there is no change to the VM's kernel. `tcpdump` shows
> > > > > > > that the iperf server received packets with invalid TCP checksums.
> > > > > > > `nstat -a` shows that the TcpInCsumErr counter keeps increasing.
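For watching that counter without nstat, here is a rough sketch that reads the TCP checksum error count from text in the /proc/net/snmp layout (one header line and one value line per protocol). It assumes the guest kernel exposes an `InCsumErrors` field on the `Tcp:` lines. The function name and the sample text are made up for illustration.

```python
def tcp_in_csum_errors(snmp_text):
    """Return the Tcp InCsumErrors counter from /proc/net/snmp-style text."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp header/value pair found")
    header, values = rows[0], rows[1]
    # Pair each field name with its value, skipping the leading "Tcp:" token.
    return int(dict(zip(header[1:], values[1:]))["InCsumErrors"])


# Made-up, abbreviated sample; a real file has more fields per line.
sample = (
    "Tcp: RtoAlgorithm ActiveOpens InSegs OutSegs InErrs InCsumErrors\n"
    "Tcp: 1 12 3456 3400 57 57\n"
)
print(tcp_in_csum_errors(sample))

# On a Linux guest (assumption: the file is readable):
# with open("/proc/net/snmp") as f:
#     print(tcp_in_csum_errors(f.read()))
```

Sampling the value before and after an iperf run shows whether segments are being rejected for bad checksums.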
> > > > > >
> > > > > > After adding changes to VM's kernel as below, iperf works properly.
> > > > > > in tcp_v4_rcv()
> > > > > > >       - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > > > > >       + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> > > > > >
> > > > > > static inline bool tcp_checksum_complete(struct sk_buff *skb)
> > > > > > {
> > > > > >         return 0;
> > > > > > }
> > > > >
> > > > > That's odd. Which kernel is that? Maybe I can try the same version.
> > > > > I am using 5.2.14-200.fc30.x86_64.
> > > > >
> > > > > > Looks like the packet somehow lost its offloading flags, so the kernel
> > > > > > has to check the csum, and since it wasn't calculated before, it's
> > > > > > just random garbage.
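To illustrate why such a packet fails software verification, here is a minimal sketch of the standard Internet checksum (RFC 1071) over the IPv4 pseudo-header and a TCP segment. The addresses, ports and payload are made up. It also assumes that an offloading Linux sender leaves only the pseudo-header sum in the checksum field, which is an assumption about the sender and not something stated in this thread.

```python
import struct


def internet_checksum(data):
    """RFC 1071 ones' complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src, dst, tcp_len):
    """IPv4 pseudo-header: src, dst, zero byte, protocol 6 (TCP), length."""
    return src + dst + struct.pack("!BBH", 0, 6, tcp_len)


def tcp_checksum_ok(src, dst, segment):
    """A segment verifies when the checksum over everything is zero."""
    return internet_checksum(pseudo_header(src, dst, len(segment)) + segment) == 0


def with_checksum(segment, value):
    """Return the segment with its checksum field (bytes 16-17) replaced."""
    return segment[:16] + struct.pack("!H", value) + segment[18:]


# Hypothetical endpoints and a 20-byte TCP header with a zeroed checksum.
src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
header = struct.pack("!HHIIBBHHH", 12345, 5001, 1, 0, 5 << 4, 0x02, 65535, 0, 0)
segment = header + b"hello"

# Full checksum, as a sender without offloading would compute it.
pseudo = pseudo_header(src, dst, len(segment))
good = with_checksum(segment, internet_checksum(pseudo + segment))

# Assumed offload case: only the (uninverted) pseudo-header sum is filled in.
partial = with_checksum(segment, ~internet_checksum(pseudo) & 0xFFFF)

print(tcp_checksum_ok(src, dst, good), tcp_checksum_ok(src, dst, partial))
```

The fully checksummed segment verifies and the partially filled one does not, which matches a receiver counting checksum errors once the offload metadata is lost.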
> > > > >
> > > > > fbl
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yifeng
> > > > > >
> > > > > > > On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner <f...@sysclose.org> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
> > > > > > > > Hi Flavio,
> > > > > > > >
> > > > > > > > > Thanks for the explanation. I followed the steps in the document but
> > > > > > > > > the TCP connection still could not be established between the 2 VMs.
> > > > > > > > >
> > > > > > > > > I finally modified the VM's kernel directly to disable TCP checksum validation
> > > > > > > > > to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs.
> > > > > > >
> > > > > > > Could you provide more details on how you did that? What's running
> > > > > > > inside the VM?
> > > > > > >
> > > > > > > I don't change anything inside of the VMs (Linux) in my testbed.
> > > > > > >
> > > > > > > fbl
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yifeng
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
> > > > > > > > > > Hi Ilya,
> > > > > > > > > >
> > > > > > > > > > Thanks for your reply.
> > > > > > > > > >
> > > > > > > > > > > The thing is, if checksum offloading is enabled in both VMs, then the
> > > > > > > > > > > sender VM will send a packet with an invalid TCP checksum, and OVS will
> > > > > > > > > > > later forward this packet to the receiver VM directly without calculating
> > > > > > > > > > > a valid checksum. As a result, the receiver VM will drop this packet
> > > > > > > > > > > because it contains an invalid checksum. This is what happened when I
> > > > > > > > > > > tried this patch.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > > When TSO is enabled, TX checksum offloading is required, so you
> > > > > > > > > > will see invalid checksums. This is well documented here:
> > > > > > > > >
> > > > > > > > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
> > > > > > > > >
> > > > > > > > > > "Additionally, if the traffic is headed to a VM within the same host
> > > > > > > > > > further optimization can be expected. As the traffic never leaves
> > > > > > > > > > the machine, no MTU needs to be accounted for, and thus no
> > > > > > > > > > segmentation and checksum calculations are required, which saves yet
> > > > > > > > > > more cycles."
> > > > > > > > >
> > > > > > > > > Therefore, it's expected to see bad csum in the traffic dumps.
> > > > > > > > >
> > > > > > > > > > To use the feature, you need a few steps: enable the feature in OvS,
> > > > > > > > > > in qemu, and inside the VM. A Linux guest usually enables the
> > > > > > > > > > feature by default if qemu offers it.
> > > > > > > > >
> > > > > > > > > HTH,
> > > > > > > > > fbl
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yifeng
> > > > > > > > > >
> > > > > > > > > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 27.01.2020 18:24, Yifeng Sun wrote:
> > > > > > > > > > > > Hi Flavio,
> > > > > > > > > > > >
> > > > > > > > > > > > > I am testing your patch using iperf between 2 VMs on the same host.
> > > > > > > > > > > > > But it seems that a TCP connection can't be established between these 2 VMs.
> > > > > > > > > > > > > When inspecting further, I found that the TCP packets have invalid checksums.
> > > > > > > > > > > > > This might be the reason.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am wondering if I missed something in the setup. Thanks a lot.
> > > > > > > > > > >
> > > > > > > > > > > > I didn't test it myself, but according to the current design, checksum offloading
> > > > > > > > > > > > (rx and tx) should be enabled in both VMs.  Otherwise all the packets will
> > > > > > > > > > > > be dropped by the guest kernel.
> > > > > > > > > > >
> > > > > > > > > > > Best regards, Ilya Maximets.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > fbl
> > > > > > >
> > > > > > > --
> > > > > > > fbl
> > > > >
> > > > > --
> > > > > fbl
> > >
> > > --
> > > fbl
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> --
> fbl