Hi Rick, very helpful as always.
On Sat, Mar 22, 2014 at 6:18 PM, Rick Macklem <rmack...@uoguelph.ca> wrote:

> Christopher Forgeron wrote:
>
> Well, you could try making if_hw_tsomax somewhat smaller. (I can't see
> how the packet including the ethernet header would be more than 64K with
> the patch, but?? For example, the ether_output() code can call ng_output()
> and I have no idea if that might grow the data size of the packet?)

That's what I was thinking. I was going to drop it down to 32K, which is extreme, but I wanted to see whether that cures it or not. Something would have to be very broken to be adding nearly 32K to a packet.

> To be honest, the optimum for NFS would be setting if_hw_tsomax == 56K,
> since that would avoid the overhead of the m_defrag() calls. However,
> it is suboptimal for other TCP transfers.

I'm very interested in NFS performance, so this is interesting to me. Do you have the time to educate me on this? I was going to spend this week hacking out the NFS server cache, as I feel ZFS does a better job; my cache stats are always terrible, as is to be expected when I have such a wide data usage pattern on these SANs.

> One other thing you could do (if you still have them) is scan the logs
> for the code with my previous printf() patch and see if there is ever
> a size > 65549 in it. If there is, then if_hw_tsomax needs to be smaller
> by at least that size - 65549. (65535 + 14 == 65549)

There were some 65548's for sure. Interestingly enough, the amount by which it overruns seems to be increasing slowly. I should possibly let it overrun and run for a long time to see if there is a steadily increasing pattern... perhaps something is accidentally growing the packet by, say, 4 bytes in a heavily loaded error condition.

> I'm not familiar enough with the mbuf/uma allocators to "confirm" it,
> but I believe the "denied" refers to cases where m_getjcl() fails to get
> a jumbo mbuf and returns NULL.
> If this were to happen in m_defrag(), it would return NULL and the ix
> driver returns ENOBUFS, so this is not the case for EFBIG errors.

BTW, about the loop that your original printf code is in, just before the "retry:" goto label: that's an error loop, and it looks to me as though all/most packets traverse it at some point?

> I don't know if increasing the limits for the jumbo mbufs via sysctl
> will help. If you are using the code without Jack's patch, which uses
> 9K mbufs, then I think it can fragment the address space and result
> in no 9K contiguous areas to allocate from. (I'm just going by what
> Garrett and others have said about this.)

I never seem to be running out of mbufs, 4K or 9K, unless it's possible for starvation to occur without incrementing the counters. Additionally, netstat -m is recording denied mbufs on boot, so on a 96 GB system that is just starting up, I don't think I am... but a large increase in the buffers is on my list of desperation things to try.

Thanks for the hint on m_getjcl(). I'll dig around and see if I can find what's happening there. I guess it's time for me to learn basic dtrace as well. :-)

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"