Hey Liping,

Following up on this issue: I have configured SR-IOV and I am no longer seeing any packet loss or latency issues.

On Mon, Sep 17, 2018 at 1:27 AM Liping Mao (limao) <li...@cisco.com> wrote:
>
> > Question: I have the br-vlan interface mapped to bond0 to run my VM (VLAN traffic), so do I need to do anything on bond0 to enable the VF/PF function? I am confused because currently my VM NIC maps to the compute node's br-vlan bridge.
>
> I have not actually used SR-IOV in my env; maybe others could help.
>
> Thanks,
> Liping Mao
>
> On 2018/9/17 11:48, "Satish Patel" <satish....@gmail.com> wrote:
>
> > Thanks Liping,
> >
> > I will check the bug for the tx/rx queue size and see if I can make it work, but it looks like my 10G NIC supports SR-IOV, so I am trying that path because it will be better in the long run.
> >
> > I deployed my cloud using openstack-ansible, so now I need to figure out how to wire SR-IOV into the openstack-ansible deployment; here is the article [1].
> >
> > Question: I have the br-vlan interface mapped to bond0 to run my VM (VLAN traffic), so do I need to do anything on bond0 to enable the VF/PF function? I am confused because currently my VM NIC maps to the compute node's br-vlan bridge.
> >
> > [root@compute-65 ~]# lspci -nn | grep -i ethernet
> > 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10)
> > 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10)
> > 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> > 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af]
> >
> > [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html
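> >
> > For reference, the VFs in the listing above hang off the PF devices (03:00.0/03:00.1), not off bond0; the bond is a software construct layered on top of them. A minimal sketch of how VFs are typically created, assuming the bnx2x driver exposes the standard sysfs knob (the interface name is an example):
> >
> >     # create 7 VFs on the physical function, then verify they appear
> >     echo 7 > /sys/class/net/enp3s0f0/device/sriov_numvfs
> >     lspci -nn | grep -i "virtual function"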
> > On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) <li...@cisco.com> wrote:
> > >
> > > Hi Satish,
> > >
> > > There is a hard limit in nova's code; I have not actually used more than 8 queues:
> > >
> > >     def _get_max_tap_queues(self):
> > >         # NOTE(kengo.sakai): In kernels prior to 3.0,
> > >         # multiple queues on a tap interface is not supported.
> > >         # In kernels 3.x, the number of queues on a tap interface
> > >         # is limited to 8. From 4.0, the number is 256.
> > >         # See: https://bugs.launchpad.net/nova/+bug/1570631
> > >         kernel_version = int(os.uname()[2].split(".")[0])
> > >         if kernel_version <= 2:
> > >             return 1
> > >         elif kernel_version == 3:
> > >             return 8
> > >         elif kernel_version == 4:
> > >             return 256
> > >         else:
> > >             return None
> > >
> > > > I am currently playing with those settings and trying to generate traffic with the hping3 tool. Do you have any tool to test traffic performance, especially for UDP-style small packets?
> > >
> > > Hping3 is good enough to reproduce it. We have an app-level test tool, but that is not your case.
> > >
> > > > Here I am trying to increase rx_queue_size and tx_queue_size but it is not working somehow. I have tried the following.
> > >
> > > Since you are not on Rocky code, it should only work via qemu.conf; maybe check whether this bug [1] affects you.
> > >
> > > > Is there a way I can automate this last task to update the queue number after rebooting the VM :) otherwise I can use cloud-init to make sure all VMs are built with the same config.
> > >
> > > Cloud-init or rc.local could be the place to do that.
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960
> > >
> > > Regards,
> > > Liping Mao
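> > >
> > > A minimal rc.local sketch for re-applying the queue count on every boot (assuming a CentOS 7 guest and an image built with multiqueue enabled; the NIC name and queue count are examples):
> > >
> > >     #!/bin/bash
> > >     # /etc/rc.d/rc.local -- the guest's queue count reverts on reboot,
> > >     # so re-apply it here (chmod +x /etc/rc.d/rc.local on CentOS 7)
> > >     ethtool -L eth0 combined 8 || true
> > >     exit 0
> > >
> > > Note that cloud-init's runcmd only runs on the first boot; bootcmd runs on every boot, so either bootcmd or rc.local works for this.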
> > > On 2018/9/17 04:09, "Satish Patel" <satish....@gmail.com> wrote:
> > >
> > > > Update on my last email.
> > > >
> > > > I am able to achieve 150kpps with queues=8, and my goal is 300kpps because some of the voice applications use 300kpps.
> > > >
> > > > Here I am trying to increase rx_queue_size and tx_queue_size but it is not working somehow. I have tried the following:
> > > >
> > > > 1. Adding the rx/tx sizes to the [libvirt] section of /etc/nova/nova.conf - (didn't work)
> > > > 2. Adding them to /etc/libvirt/qemu.conf - (didn't work)
> > > >
> > > > I have tried virsh edit on the XML, but somehow my changes are not reflected; I did virsh define on the XML after the change and hard-rebooted the guest, but no luck. How do I edit that option in the XML if I want to do that?
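> > > >
> > > > If editing the XML directly, the knobs live on the interface's <driver> element; a sketch, assuming libvirt >= 3.9 (the values are examples, and tx_queue_size may only take effect for vhost-user backends on this QEMU generation):
> > > >
> > > >     <interface type='bridge'>
> > > >       ...
> > > >       <driver name='vhost' queues='8' rx_queue_size='1024' tx_queue_size='1024'/>
> > > >     </interface>
> > > >
> > > > One likely reason the edits do not stick: a Nova hard reboot regenerates the domain XML from Nova's own configuration, discarding anything changed via virsh edit/define.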
> > > > On Sun, Sep 16, 2018 at 1:41 PM Satish Patel <satish....@gmail.com> wrote:
> > > > >
> > > > > I successfully reproduced this error with the hping3 tool, and it looks like multiqueue is our solution :) but I have a few questions you may have the answer to.
> > > > >
> > > > > 1. I created two instances (vm1.example.com & vm2.example.com).
> > > > >
> > > > > 2. I flooded traffic from vm1 using "hping3 vm2.example.com --flood" and noticed drops on the tap interface. (This is without multiqueue.)
> > > > >
> > > > > 3. I enabled multiqueue in the image and ran the same test, and again got packet drops on the tap interface. (I didn't update the queue count in the vm2 guest, so I was definitely expecting packet drops.)
> > > > >
> > > > > 4. Then I tried to update the vm2 queue count using ethtool and got the following error; I have 15 vCPUs and was trying to add 15 queues:
> > > > >
> > > > > [root@bar-mq ~]# ethtool -L eth0 combined 15
> > > > > Cannot set device channel parameters: Invalid argument
> > > > >
> > > > > Then I tried 8 queues, which works:
> > > > >
> > > > > [root@bar-mq ~]# ethtool -L eth0 combined 8
> > > > > combined unmodified, ignoring
> > > > > no channel parameters changed, aborting
> > > > > current values: tx 0 rx 0 other 0 combined 8
> > > > >
> > > > > Now I am not seeing any packet drops on the tap interface. I measured PPS and was able to get 160kpps without packet drops.
> > > > >
> > > > > Questions:
> > > > >
> > > > > 1. Why am I not able to add 15 queues? (Is this a NIC or driver limitation?)
> > > > > 2. How do I automate the "ethtool -L eth0 combined 8" command in the instance so I don't need to tell my customers to do this manually?
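> > > > >
> > > > > On (1), a quick way to check is to ask the driver for its preset limit (a sketch; eth0 is the example name from above):
> > > > >
> > > > >     # "Pre-set maximums" shows the ceiling configured on the host side
> > > > >     ethtool -l eth0
> > > > >
> > > > > For virtio-net that ceiling is the queue count QEMU was started with, which nova caps via _get_max_tap_queues (quoted earlier in this thread) -- 8 on a 3.x host kernel such as CentOS 7's 3.10 -- so 15 would be rejected even with 15 vCPUs.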
> > > > > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel <satish....@gmail.com> wrote:
> > > > > >
> > > > > > Hi Liping,
> > > > > >
> > > > > > > I think the multiqueue feature should help. (Be careful to make sure the ethtool queue-number update is also done after rebooting the VM.)
> > > > > >
> > > > > > Is there a way I can automate this last task to update the queue number after rebooting the VM :) otherwise I can use cloud-init to make sure all VMs are built with the same config.
> > > > > >
> > > > > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel <satish....@gmail.com> wrote:
> > > > > > >
> > > > > > > I am currently playing with those settings and trying to generate traffic with the hping3 tool. Do you have any tool to test traffic performance, especially for UDP-style small packets?
> > > > > > >
> > > > > > > I am going to share all my results and see what you think, because I have noticed you went through this pain :) I will try every single option you suggested to make sure we are good before I move forward to production.
> > > > > > >
> > > > > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) <li...@cisco.com> wrote:
> > > > > > > >
> > > > > > > > I think the multiqueue feature should help. (Be careful to make sure the ethtool queue-number update is also done after rebooting the VM.)
> > > > > > > >
> > > > > > > > NUMA CPU pinning and queue length will be a plus in my experience. You may need to run performance tests in your situation; in my case NUMA CPU pinning helped the app get very stable 720p/1080p transcoding performance. Not sure if your app will benefit.
> > > > > > > >
> > > > > > > > You are not using L3, which lets you avoid a lot of performance issues. And since there are only two instances at 80kpps, in your case the HW interface should not be a bottleneck either. And your Nexus 5k/7k will not be a bottleneck for sure ;-)
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Liping Mao
> > > > > > > >
> > > > > > > > On 2018-09-16 at 23:09, Satish Patel <satish....@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Thanks Liping,
> > > > > > > > >
> > > > > > > > > I am using libvirtd version 3.9.0, so it looks like I am eligible to take advantage of that feature. Phew!
> > > > > > > > >
> > > > > > > > > [root@compute-47 ~]# libvirtd -V
> > > > > > > > > libvirtd (libvirt) 3.9.0
> > > > > > > > >
> > > > > > > > > Let me tell you how I am running instances on my OpenStack. My compute node has 32 cores / 32G memory and I have created two instances on it with 15 vCPUs and 14G memory each (the two instances use 30 vCPU cores; I kept 2 cores for the compute node). On the compute node I disabled overcommit using a ratio of 1.0.
> > > > > > > > >
> > > > > > > > > I didn't configure NUMA yet because I wasn't aware of this feature. As per your last post, do you think NUMA will help fix this issue? The following is my NUMA view:
> > > > > > > > >
> > > > > > > > > [root@compute-47 ~]# numactl --hardware
> > > > > > > > > available: 2 nodes (0-1)
> > > > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
> > > > > > > > > node 0 size: 16349 MB
> > > > > > > > > node 0 free: 133 MB
> > > > > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
> > > > > > > > > node 1 size: 16383 MB
> > > > > > > > > node 1 free: 317 MB
> > > > > > > > > node distances:
> > > > > > > > > node   0   1
> > > > > > > > >   0:  10  20
> > > > > > > > >   1:  20  10
> > > > > > > > >
> > > > > > > > > I am not using any L3 router; I am using a provider VLAN network with Cisco Nexus switches for my L3 function, so I am not seeing any bottleneck there.
> > > > > > > > >
> > > > > > > > > This is the 10G NIC I have on all my compute nodes, dual 10G ports with bonding (20G):
> > > > > > > > >
> > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
> > > > > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
> > > > > > > > >
> > > > > > > > > On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) <li...@cisco.com> wrote:
> > > > > > > > > >
> > > > > > > > > > It is still possible to update the rx and tx queue lengths if your qemu and libvirt versions are higher than the versions recorded in [3]. (You should be able to update it directly in the libvirt configuration, if my memory is correct.)
> > > > > > > > > >
> > > > > > > > > > We also have some similar use cases which run audio/video services. They are CPU-consuming and have small UDP packets. Another possible tuning is using CPU pinning for the VM: you can use the NUMA-aware CPU feature to get stable CPU performance. VMs sometimes drop packets because the VM CPU is too busy; with NUMA-aware CPUs we got better performance. Our approach is similar to [a]. You need to create a flavor with special metadata and a dedicated host aggregate for NUMA-aware VMs. Dedicated CPUs are very good for media services; they make CPU performance stable.
> > > > > > > > > >
> > > > > > > > > > Another packet loss case we hit was because of the VM kernel: some of our apps use a 32-bit OS, which causes memory issues. When traffic grows beyond 50kpps it drops a lot; sometimes it even crashes. A 32-bit OS can use only very limited memory, so we had to add swap for the VM. I hope your app is using a 64-bit OS, because 32-bit could cause tons of trouble.
> > > > > > > > > >
> > > > > > > > > > BTW, if you are using a vrouter on L3, you'd better move to a provider network (no vrouter). I have not tried DVR, but if you are running without DVR, the L3 node will become the bottleneck very quickly. In particular, the default iptables conntrack table is 65535 entries; you will hit that and drop packets on L3, and even after you tune that value it is still hard to push more than 1Mpps through your network node.
> > > > > > > > > >
> > > > > > > > > > If your app does more than 200kpps per compute node, you should also look at your physical network driver's tx/rx configuration. Most of the hardware default values for tx/rx queue number and length are very poor; you may start to drop packets on the eth interface of the physical host when the rx queue is full.
> > > > > > > > > >
> > > > > > > > > > [a] https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Liping Mao
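> > > > > > > > > >
> > > > > > > > > > A minimal sketch of the flavor route described above, following [a] (the names and sizes are examples; a matching dedicated host aggregate keeps pinned and unpinned guests on separate computes):
> > > > > > > > > >
> > > > > > > > > >     # create a flavor whose vCPUs are pinned to dedicated host cores
> > > > > > > > > >     openstack flavor create --vcpus 15 --ram 14336 --disk 40 voice.pinned
> > > > > > > > > >     openstack flavor set voice.pinned \
> > > > > > > > > >       --property hw:cpu_policy=dedicated \
> > > > > > > > > >       --property hw:numa_nodes=1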
> > > > > > > > > > On 2018-09-16 at 21:18, Satish Patel <satish....@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Liping,
> > > > > > > > > > >
> > > > > > > > > > > Thank you for your reply.
> > > > > > > > > > >
> > > > > > > > > > > We notice packet drops during high load. I did try the txqueue change and it didn't help, so I believe I am going to try multiqueue.
> > > > > > > > > > >
> > > > > > > > > > > For SR-IOV I have to check whether I have support in my NIC.
> > > > > > > > > > >
> > > > > > > > > > > We are using Queens, so I think the queue size option is not possible :(
> > > > > > > > > > >
> > > > > > > > > > > We are using a VoIP application and the traffic is UDP, so our pps rate is 60k to 80k per VM instance.
> > > > > > > > > > >
> > > > > > > > > > > I will share my results as soon as I try multiqueue.
> > > > > > > > > > >
> > > > > > > > > > > Sent from my iPhone
> > > > > > > > > > >
> > > > > > > > > > > On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) <li...@cisco.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Satish,
> > > > > > > > > > > >
> > > > > > > > > > > > Did your packet loss happen always, or only under heavy load?
> > > > > > > > > > > >
> > > > > > > > > > > > AFAIK, if you do not tune anything, the VM tap can process about 50kpps before the tap device starts to drop packets.
> > > > > > > > > > > >
> > > > > > > > > > > > If it happens under heavy load, here are a couple of things you can try:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Increase the tap queue length; usually the default value is 500, and you can try larger. (It seems like you already tried this.)
> > > > > > > > > > > >
> > > > > > > > > > > > 2) Try the virtio multiqueue feature, see [1]. Virtio uses one queue for rx/tx in the VM; with this feature you can get more queues.
> > > > > > > > > > > >
> > > > > > > > > > > > 3) In the Rocky release, you can use [2] to increase the virtio queue size. The default queue size is 256/512; you may increase it to 1024, which would help increase the pps of the tap device.
> > > > > > > > > > > >
> > > > > > > > > > > > If all of these cannot meet your network performance requirement, you may need to move to DPDK / SR-IOV to get more VM performance. I have not actually used them in our env; you may refer to [3].
> > > > > > > > > > > >
> > > > > > > > > > > > [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html
> > > > > > > > > > > > [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html
> > > > > > > > > > > > [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Liping Mao
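> > > > > > > > > > > >
> > > > > > > > > > > > For (2), the switch is an image property plus a per-boot ethtool call in the guest; a minimal sketch, assuming the Liberty multiqueue spec in [1] (the image and NIC names are examples):
> > > > > > > > > > > >
> > > > > > > > > > > >     # host side: mark the image, then boot instances from it
> > > > > > > > > > > >     openstack image set --property hw_vif_multiqueue_enabled=true centos7-voip
> > > > > > > > > > > >     # guest side, after every boot (queues <= vCPUs and <= host cap):
> > > > > > > > > > > >     ethtool -L eth0 combined 8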
> > > > > > > > > > > > On 2018/9/16 13:07, "Satish Patel" <satish....@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > [root@compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop
> > > > > > > > > > > > > RX errors 0 dropped 0 overruns 0 frame 0
> > > > > > > > > > > > > TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0
> > > > > > > > > > > > >
> > > > > > > > > > > > > I noticed the tap interface dropping TX packets, and even after increasing txqueue from 1000 to 10000 nothing changed; I am still getting packet drops.
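> > > > > > > > > > > > >
> > > > > > > > > > > > > A sketch of the two commands in question, for the record (the tap name is the example from above):
> > > > > > > > > > > > >
> > > > > > > > > > > > >     # per-interface counters, including the TX dropped column
> > > > > > > > > > > > >     ip -s link show dev tap5af7f525-5f
> > > > > > > > > > > > >     # raise the transmit queue length on the tap
> > > > > > > > > > > > >     ip link set dev tap5af7f525-5f txqueuelen 10000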
> > > > > > > > > > > > > On Sat, Sep 15, 2018 at 4:22 PM Satish Patel <satish....@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Folks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I need some advice or suggestions to figure out what is going on with my network. We have noticed high packet loss on an OpenStack instance and are not sure why; at the same time, if I check on the host machine, it has zero packet loss. This is what I did for the test:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ping 8.8.8.8
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > from instance: 50% packet loss
> > > > > > > > > > > > > > from compute host: 0% packet loss
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have disabled the TSO/GSO/SG settings on the physical compute node but am still getting packet loss.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > We have 10G NICs on our network; it looks like something related to a tap interface setting.

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack