My message was automatically rejected by
openstack-operators-ow...@lists.openstack.org, so I am resending it.



Hi John,

Do you know where the packets are being dropped? On the physical interface, the
tap device, the OVS port, or inside the VM?

We have hit UDP packet loss under high pps. Here are a few things you may want
to double-check:

1.      Double check whether your physical interface is dropping packets. If
your RX ring size or RX queue count is left at the default, it will usually
drop UDP packets once a single CPU core reaches about 200 kpps (RSS distributes
traffic across cores, but in my experience each individual core starts dropping
at roughly 200 kpps).

You can read the statistics from ethtool -S <interface> to check whether
packets are being lost because the RX queue is full, and use ethtool to
increase the ring size. In my environment, increasing the ring size from 512 to
4096 doubled the single-core throughput from 200 kpps to 400 kpps. This may
help in some cases.
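For example (a rough sketch; eth0 is a placeholder for your physical interface,
and the exact counter names vary by NIC driver):

  # look for RX drops / FIFO errors on the NIC
  ethtool -S eth0 | grep -i -E 'drop|fifo|miss'
  # show the current and maximum supported ring sizes
  ethtool -g eth0
  # raise the RX ring size, if the NIC supports it
  ethtool -G eth0 rx 4096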



2.      Double check whether your TAP device is dropping packets. The default
tx_queue length is 500 or 1000; increasing it to 10000 may help in some cases.
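For example (tap0 is just a placeholder for the instance's tap device name):

  # check for drops on the tap device
  ip -s link show dev tap0
  # raise the transmit queue length
  ip link set dev tap0 txqueuelen 10000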


3.      Double check nf_conntrack_max on your compute and network nodes. The
default value is 65535; in our case the connection count usually reaches
500k-1M. We changed it as follows:
net.netfilter.nf_conntrack_max=10240000
net.nf_conntrack_max=10240000
If you see something like “nf_conntrack: table full, dropping packet” in your
/var/log/messages log, it means you have hit this one.
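For example (standard sysctl usage; the value above is what we used, not a
universal recommendation):

  # see how close the table is to its limit
  sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
  # apply at runtime
  sysctl -w net.netfilter.nf_conntrack_max=10240000
  # persist by adding the lines above to /etc/sysctl.conf, then reload
  sysctl -p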


4.      You could check whether the drops happen inside your VM; increasing the
following parameters may help in some cases:

net.core.rmem_max / net.core.rmem_default / net.core.wmem_max /
net.core.wmem_default
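For example, inside the guest (16777216 is only an illustrative value, not one
taken from this thread):

  # larger socket buffers give the application more room to drain the UDP queue
  sysctl -w net.core.rmem_max=16777216
  sysctl -w net.core.rmem_default=16777216
  sysctl -w net.core.wmem_max=16777216
  sysctl -w net.core.wmem_default=16777216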


5.      If you are using the default network driver (virtio-net), double check
whether the vhost thread of your VM is saturated with CPU soft IRQs. You can
find it by looking for the process named vhost-$PID_OF_YOUR_VM. In that case,
you can try the following feature from “L” (Liberty):

https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html
      Multi-queue may help in some cases, but it will use more vhost threads
and more CPU on your host.
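For example (a sketch; the image name, interface name and queue count are
placeholders, and hw_vif_multiqueue_enabled is the image property described in
the spec above):

  # on the compute node: check whether the VM's vhost thread is pegging a core
  top -bHn1 | grep vhost-
  # enable virtio-net multi-queue for instances booted from this image
  openstack image set --property hw_vif_multiqueue_enabled=true <image>
  # inside the guest: spread the virtio queues across vCPUs
  ethtool -L eth0 combined 4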


6.      Sometimes CPU/NUMA pinning can also help, but you need to reserve the
CPUs and plan your CPU layout statically.
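For example (a sketch; the CPU range and flavor name are placeholders):

  # on the compute node, reserve host CPUs for pinned guests in nova.conf,
  # e.g. vcpu_pin_set = 4-23 under [DEFAULT], then restart nova-compute
  # pin the flavor's vCPUs and keep the guest on a single NUMA node
  openstack flavor set --property hw:cpu_policy=dedicated --property hw:numa_nodes=1 m1.voice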

I think we should first figure out where the packets are being lost and which
component is the bottleneck. Hope this helps, John.
Thanks.

Regards,
Liping Mao

From: John Petrini <jpetr...@coredial.com>
Date: Friday, July 28, 2017, 03:35
To: Pedro Sousa <pgso...@gmail.com>, OpenStack Mailing List
<openst...@lists.openstack.org>, "openstack-operators@lists.openstack.org"
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack] [Openstack-operators] UDP Buffer Filling

Hi Pedro,

Thank you for the suggestion. I will look into this.


John Petrini

Platforms Engineer   //   CoreDial, LLC   //
coredial.com<http://coredial.com/>   //   Twitter
<https://twitter.com/coredial>   //   LinkedIn
<http://www.linkedin.com/company/99631>   //   Google Plus
<https://plus.google.com/104062177220750809525/posts>   //   Blog
<http://success.coredial.com/blog>
751 Arbor Way, Hillcrest I, Suite 150, Blue Bell, PA 19422
P: 215.297.4400 x232   //   F: 215.297.4401   //   E: jpetr...@coredial.com

On Thu, Jul 27, 2017 at 12:25 PM, Pedro Sousa <pgso...@gmail.com> wrote:
Hi,

have you considered implementing a network acceleration technique such as
OVS-DPDK or SR-IOV?

For these kinds of workloads (voice, video) with low-latency requirements, you
might need something like DPDK to avoid these issues.

Regards

On Thu, Jul 27, 2017 at 4:49 PM, John Petrini <jpetr...@coredial.com> wrote:
Hi List,

We are running Mitaka with VLAN provider networking. We've recently encountered 
a problem where the UDP receive queue on instances is filling up and we begin 
dropping packets. Moving instances out of OpenStack onto bare metal resolves 
the issue completely.

These instances are running Asterisk, which should be pulling these packets off
the queue, but it appears to fall behind no matter how many resources we give
it.

We can't seem to pin down a reason why we would see this behavior in KVM but 
not on metal. I'm hoping someone on the list might have some insight or ideas.

Thank You,

John


