Re: Sudden mbuf demand increase and shortage under the load

Maxim Sobolev Mon, 15 Feb 2010 22:20:49 -0800

Sergey Babkin wrote:

Maxim Sobolev wrote:

Hi,


Our company has a FreeBSD-based product that consists of numerous
interconnected processes and does some high-PPS UDP processing
(30-50K PPS is not uncommon). We are seeing some strange periodic
failures under load in several such systems, which usually manifest
as IPC (even through unix domain sockets) suddenly either
breaking down or pausing and recovering only some time later (like 5-10
minutes). The only sign of failure I managed to find was an increase in
the "requests for mbufs denied" counter in netstat -m and the total number
of mbuf clusters (nmbclusters) rising up to the limit.


As a simple idea: UDP is not flow-controlled. So potentially
nothing stops an application from sending packets as fast
as it can. If that is faster than the network card can process,
they would start to pile up. So this might be worth a try
as a way to reproduce the problem and see if the system has
a safeguard against it or not.

Another possibility: what happens if a process is bound to
a UDP socket but doesn't actually read the data from it?
FreeBSD used to be pretty good at it, just throwing away
the data beyond a certain limit, whereas SVR4 would run out of
network memory. But it might have changed, so it might be
worth a look too.
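
A rough way to try the "bound but not reading" case on a test box is sketched below. It is only an illustration, not code from the product discussed in this thread: the function name, datagram count, payload size and receive buffer size are all made-up values, and it only uses the loopback interface. It sends a burst of datagrams to a socket nobody reads from, then drains the socket to see how many datagrams the kernel actually kept.

```python
import socket


def flood_unread_socket(count=2000, size=1024, rcvbuf=4096):
    """Send `count` datagrams to a bound socket that nobody reads,
    then drain it and return (sent, received).

    All parameter values are arbitrary and chosen for illustration only.
    """
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Ask for a small receive buffer so the limit is reached quickly.
        rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
        rx.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        rx.setblocking(False)
        tx.setblocking(False)
        dest = rx.getsockname()
        payload = b"x" * size

        sent = 0
        for _ in range(count):
            try:
                tx.sendto(payload, dest)
                sent += 1
            except OSError:
                # The sender itself may be refused (e.g. ENOBUFS); skip it.
                pass

        # Only now start reading: count what survived in the socket buffer.
        received = 0
        while True:
            try:
                rx.recv(size)
                received += 1
            except BlockingIOError:
                break
        return sent, received
    finally:
        rx.close()
        tx.close()


if __name__ == "__main__":
    sent, received = flood_unread_socket()
    print("sent:", sent, "still queued for the reader:", received)
```

If the per-socket limit works as expected, the second number stays far below the first, and "netstat -m" taken before and after the run should show whether the unread socket has any effect on the mbuf counters.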

Thanks. Yes, the latter could actually be the case. The former is less likely since the system doesn't generate much traffic by itself, but rather relays what it receives from the network pretty much in a 1:1 ratio. It could happen, though, if the output path somehow stalled. However, netstat -I igb0 shows zero Oerrs, which I guess means that we can rule that out too, unless there is some bug in the driver.

So we are looking for potential issues that can cause a UDP forwarding application to stall and not dequeue packets on time. So far we have identified some culprits in the application logic that can cause such stalls in the unlikely event of gettimeofday() time going backwards. I've seen some messages from ntpd around the time of the problem, although it's unclear whether those are a result of the mbuf shortage or could indicate the root issue. We've also added some debug output to catch any abnormalities in the processing times.
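
To illustrate the kind of stall meant here, below is a minimal sketch of a fixed-period loop whose sleep time is computed from a clock. The Pacer class, the 20 ms period and the 300 second backwards step are hypothetical and are not taken from our application. It shows how a wall clock stepping backwards turns into one very long sleep, and how a monotonic clock avoids that.

```python
import time


class Pacer:
    """Hands out sleep times for a fixed-period loop, using an injectable clock."""

    def __init__(self, period, clock=time.monotonic):
        self.period = period
        self.clock = clock
        self.next_tick = clock() + period

    def delay(self):
        """Return seconds to sleep before the next tick, then schedule the one after."""
        wait = max(self.next_tick - self.clock(), 0.0)
        self.next_tick += self.period
        return wait


def stepped_clock(values):
    """Fake clock that returns the given readings one by one (illustration only)."""
    readings = iter(values)
    return lambda: next(readings)


# Wall-clock style readings: time is stepped back by 300 s between two calls.
wall = Pacer(0.02, clock=stepped_clock([1000.0, 700.0]))
stalled_wait = wall.delay()  # roughly 300 s instead of 20 ms

# Same loop on the monotonic clock: the wait never exceeds one period.
mono = Pacer(0.02)
normal_wait = mono.delay()
```

A loop that sleeps for stalled_wait would not touch its sockets for about five minutes, while the monotonic variant is unaffected by clock adjustments.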

In any case, I am a little surprised at how easily FreeBSD can let mbuf storage overflow. I'd expect it to be more aggressive in dropping things received from the network once one application stalls. Combined with the fact that we apparently use shared storage for different kinds of network activity and perhaps IPC too, this gives an easy opportunity for DoS attacks. To me, separate limits for separate protocols or even classes of traffic (e.g. local/remote) would make a lot of sense.

Thanks to everybody for the useful tips and suggestions. I will do more research along these lines and let you know once we either resolve the case or have more diagnostic output.


Regards,
--
Maksym Sobolyev
Sippy Software, Inc.
Internet Telephony (VoIP) Experts
T/F: +1-646-651-1110
Web: http://www.sippysoft.com
MSN: sa...@sippysoft.com
Skype: SippySoft
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"